Core API
culicidaelab.core
    Core components of the CulicidaeLab library.
This module provides the base classes, configuration management, and resource handling functionalities that form the foundation of the library. It exports key classes and functions for convenient access from other parts of the application.
Attributes:
| Name | Type | Description | 
|---|---|---|
| __all__ | list[str] | A list of the public objects of this module. | 
__all__ = ['BasePredictor', 'BaseProvider', 'WeightsManagerProtocol', 'BaseInferenceBackend', 'ConfigManager', 'CulicidaeLabConfig', 'PredictorConfig', 'DatasetConfig', 'ProviderConfig', 'SpeciesModel', 'SpeciesConfig', 'BoundingBox', 'Detection', 'DetectionPrediction', 'SegmentationPrediction', 'Classification', 'ClassificationPrediction', 'ProviderService', 'ResourceManager', 'Settings', 'get_settings', 'download_file']
  
      module-attribute
  
    
BasePredictor
    Abstract base class for all predictors.
This class defines the common interface for all predictors (e.g., detector, segmenter, classifier). It relies on the main Settings object for configuration and a backend for model execution.
Attributes:
| Name | Type | Description | 
|---|---|---|
| settings | Settings | The main settings object for the library. | 
| predictor_type | str | The key for this predictor in the configuration (e.g., 'classifier'). | 
| backend | BaseInferenceBackend | An object that inherits from BaseInferenceBackend for model loading and inference. | 
Source code in culicidaelab\core\base_predictor.py
settings = settings
  
      instance-attribute
  
    
predictor_type = predictor_type
  
      instance-attribute
  
    
backend = backend
  
      instance-attribute
  
    
config: PredictorConfig
  
      property
  
    Get the predictor configuration Pydantic model.
Returns:
| Name | Type | Description | 
|---|---|---|
| PredictorConfig | PredictorConfig | The configuration object for this predictor. | 
model_loaded: bool
  
      property
  
    Check if the model is loaded.
Returns:
| Name | Type | Description | 
|---|---|---|
| bool | bool | True if the model is loaded, False otherwise. | 
__init__(settings: Settings, predictor_type: str, backend: BaseInferenceBackend, load_model: bool = False)
    Initializes the predictor.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| settings | Settings | The main Settings object for the library. | required | 
| predictor_type | str | The key for this predictor in the configuration (e.g., 'classifier'). | required | 
| backend | BaseInferenceBackend | An object that inherits from BaseInferenceBackend for model loading and inference. | required | 
| load_model | bool | If True, loads the model immediately upon initialization. | False | 
Source code in culicidaelab\core\base_predictor.py
              
__call__(input_data: InputDataType, **kwargs: Any) -> Any
    Convenience method that calls predict().
This allows the predictor instance to be called as a function.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data | InputDataType | The input data for the prediction. | required | 
| **kwargs | Any | Additional arguments to pass to the predict method. | {} |
Returns:
| Name | Type | Description | 
|---|---|---|
| Any | Any | The result of the prediction. | 
Source code in culicidaelab\core\base_predictor.py
              
__enter__()
    Context manager entry.
Loads the model if it is not already loaded.
Returns:
| Type | Description |
|---|---|
| BasePredictor | The predictor instance. |
__exit__(exc_type, exc_val, exc_tb)
    Context manager exit.
This default implementation does nothing, but can be overridden to handle resource cleanup.
model_context()
    A context manager for temporary model loading.
Ensures the model is loaded upon entering the context and unloaded upon exiting if it was not loaded before. This is useful for managing memory in pipelines.
Yields:
| Type | Description |
|---|---|
| BasePredictor | The predictor instance itself. |
Example
with predictor.model_context():
    predictions = predictor.predict(data)
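A minimal sketch of the load-on-enter, unload-on-exit behavior described for model_context(). `ToyPredictor` is a hypothetical stand-in, not the library's BasePredictor; it only borrows the `model_loaded`, `load_model`, and `unload_model` names listed in this reference, and the placeholder "model" is an assumption for illustration.

```python
from contextlib import contextmanager


class ToyPredictor:
    """Hypothetical stand-in for a predictor; not the library class."""

    def __init__(self):
        self._model = None

    @property
    def model_loaded(self) -> bool:
        return self._model is not None

    def load_model(self) -> None:
        self._model = object()  # placeholder for real model weights

    def unload_model(self) -> None:
        self._model = None

    @contextmanager
    def model_context(self):
        # Remember whether the caller had already loaded the model.
        was_loaded = self.model_loaded
        if not was_loaded:
            self.load_model()
        try:
            yield self
        finally:
            # Unload only if this context did the loading.
            if not was_loaded:
                self.unload_model()


predictor = ToyPredictor()
with predictor.model_context() as p:
    loaded_inside = p.model_loaded
loaded_after = predictor.model_loaded
```

Because the context records the prior state, a model that was loaded before entering stays loaded after exiting, which matches the "unloaded upon exiting if it was not loaded before" wording.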
Source code in culicidaelab\core\base_predictor.py
              
evaluate(ground_truth: GroundTruthType, prediction: PredictionType | None = None, input_data: InputDataType | None = None, **predict_kwargs: Any) -> dict[str, float]
    Evaluate a prediction against a ground truth.
Either prediction or input_data must be provided. If prediction
is provided, it is used directly. If prediction is None, input_data
is used to generate a new prediction.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| ground_truth | GroundTruthType | The ground truth annotation. | required | 
| prediction | PredictionType | A pre-computed prediction. | None | 
| input_data | InputDataType | Input data to generate a prediction from, if one isn't provided. | None | 
| **predict_kwargs | Any | Additional arguments passed to the predict method when a prediction is generated from input_data. | {} |
Returns:
| Type | Description | 
|---|---|
| dict[str, float] | Dictionary containing evaluation metrics for a single item. |
Raises:
| Type | Description | 
|---|---|
| ValueError | If neither prediction nor input_data is provided. |
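The "prediction or input_data" rule can be sketched as a small standalone function. This is an illustration under assumptions, not the library's implementation: `predict_fn` and `metric_fn` are hypothetical callables standing in for the predictor's own prediction and metric logic.

```python
from typing import Any, Callable, Optional


def evaluate(
    ground_truth: Any,
    predict_fn: Callable[..., Any],          # hypothetical: stands in for predict()
    metric_fn: Callable[[Any, Any], dict],   # hypothetical: computes the metrics
    prediction: Optional[Any] = None,
    input_data: Optional[Any] = None,
    **predict_kwargs: Any,
) -> dict:
    if prediction is None:
        if input_data is None:
            raise ValueError("Either 'prediction' or 'input_data' must be provided.")
        # No pre-computed prediction: generate one from the input.
        prediction = predict_fn(input_data, **predict_kwargs)
    return metric_fn(prediction, ground_truth)


def toy_predict(x, offset=0):
    return x + offset


def toy_metric(pred, truth):
    return {"abs_error": float(abs(pred - truth))}


direct = evaluate(3, toy_predict, toy_metric, prediction=5)
generated = evaluate(3, toy_predict, toy_metric, input_data=1, offset=1)
```

A pre-computed prediction takes precedence, so `predict_fn` is only called when `prediction` is None.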
Source code in culicidaelab\core\base_predictor.py
              
evaluate_batch(ground_truth_batch: Sequence[GroundTruthType], predictions_batch: Sequence[PredictionType] | None = None, input_data_batch: Sequence[InputDataType] | None = None, num_workers: int = 1, show_progress: bool = False, **predict_kwargs: Any) -> dict[str, Any]
    Evaluate on a batch of items using parallel processing.
Either predictions_batch or input_data_batch must be provided.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| ground_truth_batch | Sequence[GroundTruthType] | List of corresponding ground truth annotations. | required | 
| predictions_batch | Sequence[PredictionType] | A pre-computed list of predictions. | None | 
| input_data_batch | Sequence[InputDataType] | List of input data to generate predictions from. | None | 
| num_workers | int | Number of parallel workers for calculating metrics. | 1 | 
| show_progress | bool | Whether to show a progress bar. | False | 
| **predict_kwargs | Any | Additional arguments passed to the prediction call when predictions are generated from input_data_batch. | {} |
Returns:
| Type | Description | 
|---|---|
| dict[str, Any] | Dictionary containing aggregated evaluation metrics. |
Raises:
| Type | Description | 
|---|---|
| ValueError | If the number of predictions does not match the number of ground truths, or if required inputs are missing. | 
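One way to picture batch evaluation with parallel workers is the sketch below. It assumes pre-computed predictions, a hypothetical per-item `metric_fn`, and mean aggregation with a `count` key; the library's actual aggregation format is not specified in this reference.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Any, Callable, Sequence


def evaluate_batch(
    ground_truth_batch: Sequence[Any],
    predictions_batch: Sequence[Any],
    metric_fn: Callable[[Any, Any], dict],  # hypothetical per-item metric
    num_workers: int = 1,
) -> dict:
    if len(predictions_batch) != len(ground_truth_batch):
        raise ValueError("Number of predictions must match number of ground truths.")
    # Compute per-item metrics in parallel; map() preserves input order.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_item = list(pool.map(metric_fn, predictions_batch, ground_truth_batch))
    if not per_item:
        return {"count": 0}
    # Assumed aggregation: the mean of each metric across items.
    report = {key: mean(item[key] for item in per_item) for key in per_item[0]}
    report["count"] = len(per_item)
    return report


def abs_error(pred, truth):
    return {"abs_error": float(abs(pred - truth))}


report = evaluate_batch([1, 3, 2], [1, 2, 4], abs_error, num_workers=2)
```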
Source code in culicidaelab\core\base_predictor.py
              
get_model_info() -> dict[str, Any]
    Gets information about the loaded model.
Returns:
| Type | Description | 
|---|---|
| dict[str, Any] | A dictionary containing details about the model, such as architecture, path, etc. |
Source code in culicidaelab\core\base_predictor.py
              
load_model() -> None
    Delegates model loading to the configured backend.
Source code in culicidaelab\core\base_predictor.py
              
predict(input_data: InputDataType, **kwargs: Any) -> PredictionType
    Makes a prediction on a single input data sample.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data | InputDataType | The input data (e.g., an image as a NumPy array) to make a prediction on. | required | 
| **kwargs | Any | Additional predictor-specific arguments. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| PredictionType | PredictionType | The prediction result, with a format specific to the predictor type. |
Raises:
| Type | Description | 
|---|---|
| RuntimeError | If the model is not loaded before calling this method. | 
Source code in culicidaelab\core\base_predictor.py
              
predict_batch(input_data_batch: Sequence[InputDataType], show_progress: bool = False, **kwargs: Any) -> list[PredictionType]
    Makes predictions on a batch of inputs by delegating to the backend.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data_batch | Sequence[InputDataType] | A sequence of inputs. | required | 
| show_progress | bool | If True, displays a progress bar. | False | 
| **kwargs | Any | Additional arguments for the backend's predict_batch method. | {} |
Returns:
| Type | Description | 
|---|---|
| list[PredictionType] | A list of prediction results. |
Source code in culicidaelab\core\base_predictor.py
              
unload_model() -> None
    
visualize(input_data: InputDataType, predictions: PredictionType, save_path: str | Path | None = None) -> np.ndarray
  
      abstractmethod
  
    Visualizes the predictions on the input data.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data | InputDataType | The original input data (e.g., an image). | required | 
| predictions | PredictionType | The prediction result obtained from the predict method. | required |
| save_path | str \| Path | An optional path to save the visualization to a file. | None |
Returns:
| Type | Description | 
|---|---|
| ndarray | A NumPy array representing the visualized image. |
Source code in culicidaelab\core\base_predictor.py
              
BaseProvider
    Abstract base class for all data and model providers.
This class defines the contract for providers that fetch resources like datasets and model weights.
Source code in culicidaelab\core\base_provider.py
                
download_dataset(dataset_name: str, save_dir: Path | None = None, *args: Any, **kwargs: Any) -> Path
  
      abstractmethod
  
    Downloads a dataset from a source.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| dataset_name | str | The name or identifier of the dataset to download. | required | 
| save_dir | Path \| None | The directory to save the dataset. If None, a default directory may be used. | None |
| *args | Any | Additional positional arguments for the provider's implementation. | () | 
| **kwargs | Any | Additional keyword arguments for the provider's implementation. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The path to the downloaded dataset directory or file. | 
Raises:
| Type | Description | 
|---|---|
| NotImplementedError | If the method is not implemented by a subclass. | 
Source code in culicidaelab\core\base_provider.py
              
download_model_weights(repo_id: str, filename: str, local_dir: Path, *args: Any, **kwargs: Any) -> Path
  
      abstractmethod
  
    Downloads model weights and returns the path to them.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| repo_id | str | The repository ID from which to download the model (e.g., 'culicidae/mosquito-detector'). | required | 
| filename | str | The name of the weights file in the repository. | required | 
| local_dir | Path | The local directory to save the weights file. | required | 
| *args | Any | Additional positional arguments for the provider's implementation. | () | 
| **kwargs | Any | Additional keyword arguments for the provider's implementation. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The local path to the downloaded model weights file. | 
Raises:
| Type | Description | 
|---|---|
| NotImplementedError | If the method is not implemented by a subclass. | 
Source code in culicidaelab\core\base_provider.py
              
get_provider_name() -> str
  
      abstractmethod
  
    Gets the unique name of the provider.
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | A string representing the provider's name (e.g., 'huggingface'). | 
load_dataset(dataset_path: str | Path, **kwargs: Any) -> Any
  
      abstractmethod
  
    Loads a dataset from a local path.
This method is responsible for loading a dataset that has already been downloaded to the local filesystem.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| dataset_path | str \| Path | The local path to the dataset, typically a path returned by download_dataset. | required |
| **kwargs | Any | Additional keyword arguments for loading the dataset, which may vary by provider and dataset format. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Any | Any | The loaded dataset object, which could be a Hugging Face Dataset, a PyTorch Dataset, a Pandas DataFrame, or another format. |
Raises:
| Type | Description | 
|---|---|
| NotImplementedError | If the method is not implemented by a subclass. | 
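To make the provider contract concrete, here is a hedged sketch of a subclass that fills in all four abstract methods. `ProviderContract` is a hypothetical mirror of the method names and signatures listed above (it is not the library's BaseProvider), and `LocalFolderProvider` with its `labels.txt` file is invented for illustration.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Optional


class ProviderContract(ABC):
    """Hypothetical mirror of the four abstract methods listed above."""

    @abstractmethod
    def download_dataset(self, dataset_name: str, save_dir: Optional[Path] = None,
                         *args: Any, **kwargs: Any) -> Path: ...

    @abstractmethod
    def download_model_weights(self, repo_id: str, filename: str, local_dir: Path,
                               *args: Any, **kwargs: Any) -> Path: ...

    @abstractmethod
    def get_provider_name(self) -> str: ...

    @abstractmethod
    def load_dataset(self, dataset_path, **kwargs: Any) -> Any: ...


class LocalFolderProvider(ProviderContract):
    """Illustrative provider that 'downloads' by writing small local files."""

    def __init__(self, root: Path):
        self.root = root

    def download_dataset(self, dataset_name, save_dir=None, *args, **kwargs):
        target = (save_dir or self.root) / dataset_name
        target.mkdir(parents=True, exist_ok=True)
        (target / "labels.txt").write_text("aedes_aegypti\n")
        return target

    def download_model_weights(self, repo_id, filename, local_dir, *args, **kwargs):
        local_dir.mkdir(parents=True, exist_ok=True)
        path = local_dir / filename
        path.write_bytes(b"")  # empty placeholder file
        return path

    def get_provider_name(self):
        return "local_folder"

    def load_dataset(self, dataset_path, **kwargs):
        return (Path(dataset_path) / "labels.txt").read_text().split()


with tempfile.TemporaryDirectory() as tmp:
    provider = LocalFolderProvider(Path(tmp))
    dataset_dir = provider.download_dataset("toy")
    labels = provider.load_dataset(dataset_dir)
```

Note that with Python's `abc` module, a subclass that omits one of the abstract methods fails at instantiation with a TypeError.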
Source code in culicidaelab\core\base_provider.py
              
WeightsManagerProtocol
    Source code in culicidaelab\core\weights_manager_protocol.py
                
ensure_weights(predictor_type: str, backend_type: str) -> Path
    Ensures model weights are available locally and returns their path.
This method is responsible for managing model weight files, including checking their existence, downloading if necessary, and providing the absolute path to the weights file. It abstracts away the details of weight file management from the rest of the system.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| predictor_type | str | The type of predictor requiring the weights. Common values include 'classifier', 'detector', or 'segmenter'. | required | 
| backend_type | str | The backend framework for which the weights are needed. Examples include 'fastai', 'onnx', 'yolo', or 'sam'. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | Absolute path to the model weights file. The returned path is guaranteed to exist and be accessible. | 
Example
import torch
from your_module import WeightsManager  # placeholder import path
weights_manager = WeightsManager()
# Get weights for a FastAI classifier
classifier_weights = weights_manager.ensure_weights(
    predictor_type="classifier",
    backend_type="fastai",
)
# Use the weights in a model (model is assumed to be an existing torch.nn.Module)
model.load_state_dict(torch.load(classifier_weights))
Note
Implementations should handle various scenarios such as:
- Checking if weights exist locally
- Downloading weights from remote sources if needed
- Validating weight file integrity
- Managing weight file versions
- Handling download failures and retry logic
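The check-then-download behavior can be sketched with `typing.Protocol`. Everything below is an assumption for illustration: `WeightsManagerLike` mirrors only the `ensure_weights` signature shown above, and `CachingWeightsManager` fakes the download by writing a small file, with no integrity checks, versioning, or retries.

```python
import tempfile
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class WeightsManagerLike(Protocol):
    """Hypothetical mirror of the protocol's single method."""

    def ensure_weights(self, predictor_type: str, backend_type: str) -> Path: ...


class CachingWeightsManager:
    """Illustrative implementation; satisfies the protocol structurally."""

    def __init__(self, cache_dir: Path):
        self.cache_dir = cache_dir
        self.downloads = 0

    def _fetch(self, destination: Path) -> None:
        # Stand-in for a real remote download.
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_bytes(b"fake-weights")
        self.downloads += 1

    def ensure_weights(self, predictor_type: str, backend_type: str) -> Path:
        path = self.cache_dir / predictor_type / f"{backend_type}.bin"
        if not path.exists():
            self._fetch(path)
        return path.resolve()


with tempfile.TemporaryDirectory() as tmp:
    manager = CachingWeightsManager(Path(tmp))
    first = manager.ensure_weights("classifier", "onnx")
    second = manager.ensure_weights("classifier", "onnx")
    satisfies_protocol = isinstance(manager, WeightsManagerLike)
    download_count = manager.downloads
    same_path = first == second
    existed = first.exists()
```

The second call finds the cached file and skips the fetch, and the class never inherits from the protocol; matching the method shape is enough.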
Source code in culicidaelab\core\weights_manager_protocol.py
              
BaseInferenceBackend
    Abstract base class for an inference backend.
This class defines the required methods for an inference backend, which is responsible for loading a model and running predictions. It includes a default implementation for batch prediction that iterates through single predictions.
Attributes:
| Name | Type | Description | 
|---|---|---|
| predictor_type | str | The type of predictor this backend serves (e.g., 'classifier'). | 
| model | Any | The loaded model object. Initially None. | 
Source code in culicidaelab\core\base_inference_backend.py
predictor_type = predictor_type
  
      instance-attribute
  
    
model: Any = None
  
      instance-attribute
  
    
is_loaded: bool
  
      property
  
    Checks if the model is loaded into memory.
Returns:
| Type | Description | 
|---|---|
| bool | True if the model is loaded, False otherwise. | 
__init__(predictor_type: str)
    Initializes the BaseInferenceBackend.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| predictor_type | str | The type of predictor (e.g., 'classifier', 'detector'). | required | 
Source code in culicidaelab\core\base_inference_backend.py
              
            
load_model(**kwargs: Any) -> None
  
      abstractmethod
  
    Loads the model into memory.
This method should handle all aspects of model loading, such as reading weights from a file and preparing the model for inference.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| **kwargs | Any | Backend-specific arguments for model loading. | {} | 
Source code in culicidaelab\core\base_inference_backend.py
              
predict(input_data: InputDataType, **kwargs: Any) -> PredictionType
  
      abstractmethod
  
    Runs a prediction on a single input.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data | InputDataType | The data to be processed by the model. | required | 
| **kwargs | Any | Additional backend-specific arguments for prediction. | {} | 
Returns:
| Type | Description | 
|---|---|
| PredictionType | The prediction result. | 
Source code in culicidaelab\core\base_inference_backend.py
              
unload_model() -> None
    Unloads the model and releases resources.
This method is intended to free up memory (especially GPU memory) by deleting the model instance.
Source code in culicidaelab\core\base_inference_backend.py
              
            
predict_batch(input_data_batch: list[InputDataType], show_progress: bool = False, **kwargs: Any) -> list[PredictionType]
    Makes predictions on a batch of inputs.
This method provides a default implementation that iterates through the batch
and calls predict for each item. Backends that support native batching
should override this method for better performance.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| input_data_batch | list[InputDataType] | A list of inputs to process. | required | 
| show_progress | bool | If True, displays a progress bar. | False | 
| **kwargs | Any | Additional arguments to pass to the  | {} | 
Returns:
| Type | Description | 
|---|---|
| list[PredictionType] | A list of prediction results. | 
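A minimal sketch of the default batching behavior: the base class loops over `predict`, and a concrete backend only has to supply `load_model` and `predict`. `BackendSketch` and `DoublingBackend` are hypothetical names, the `is_loaded` check is an assumed implementation, and `show_progress` is accepted but ignored here.

```python
from abc import ABC, abstractmethod
from typing import Any


class BackendSketch(ABC):
    """Hypothetical mirror of the backend interface described above."""

    def __init__(self, predictor_type: str):
        self.predictor_type = predictor_type
        self.model: Any = None

    @property
    def is_loaded(self) -> bool:
        return self.model is not None

    @abstractmethod
    def load_model(self, **kwargs: Any) -> None: ...

    @abstractmethod
    def predict(self, input_data: Any, **kwargs: Any) -> Any: ...

    def unload_model(self) -> None:
        self.model = None

    def predict_batch(self, input_data_batch, show_progress=False, **kwargs):
        # Default: one predict() call per item (progress display omitted).
        return [self.predict(item, **kwargs) for item in input_data_batch]


class DoublingBackend(BackendSketch):
    """Toy backend whose 'model' doubles its input."""

    def load_model(self, **kwargs: Any) -> None:
        self.model = lambda x: 2 * x

    def predict(self, input_data: Any, **kwargs: Any) -> Any:
        if not self.is_loaded:
            raise RuntimeError("Model is not loaded.")
        return self.model(input_data)


backend = DoublingBackend("classifier")
backend.load_model()
results = backend.predict_batch([1, 2, 3])
```

A backend with native batching would override `predict_batch` and run the whole list through the model in one call.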
Source code in culicidaelab\core\base_inference_backend.py
              
ConfigManager
    Handles loading, merging, and validating configurations for the library.
This manager implements a robust loading strategy:
1. Loads default YAML configurations bundled with the library.
2. Loads user-provided YAML configurations from a specified directory.
3. Merges the user's configuration on top of the defaults.
4. Validates the final merged configuration against Pydantic models.
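The merge and validate steps can be sketched with plain dictionaries. This is a sketch under assumptions: the recursive merge semantics are a guess at what "on top of the defaults" means, `deep_merge` and `validate` are hypothetical helpers, and the hand-written range check stands in for Pydantic validation.

```python
from copy import deepcopy
from typing import Any


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict with overrides applied recursively over defaults."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def validate(config: dict) -> dict:
    # Stand-in for model validation: check a single field's range.
    confidence = config["predictors"]["classifier"]["confidence"]
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return config


# Steps 1 and 2: already-parsed default and user configurations.
defaults = {"predictors": {"classifier": {"confidence": 0.5, "device": "cpu"}}}
user = {"predictors": {"classifier": {"device": "cuda"}}}

# Steps 3 and 4: merge, then validate.
config = validate(deep_merge(defaults, user))
```

With a recursive merge, the user file only needs to name the keys it changes; `confidence` keeps its default while `device` is overridden.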
Attributes:
| Name | Type | Description | 
|---|---|---|
| user_config_dir | Path \| None | The user configuration directory. |
| default_config_path | Path | The path to the default config directory. | 
| config | CulicidaeLabConfig | The validated configuration object. | 
Source code in culicidaelab\core\config_manager.py
user_config_dir = Path(user_config_dir) if user_config_dir else None
  
      instance-attribute
  
    
default_config_path = self._get_default_config_path()
  
      instance-attribute
  
    
config: CulicidaeLabConfig = self._load()
  
      instance-attribute
  
    
__init__(user_config_dir: str | Path | None = None)
    Initializes the ConfigManager.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| user_config_dir | str \| Path | Path to a directory containing user-defined YAML configuration files. These will override the defaults. | None |
Source code in culicidaelab\core\config_manager.py
              
get_config() -> CulicidaeLabConfig
    Returns the fully validated Pydantic configuration object.
Returns:
| Name | Type | Description | 
|---|---|---|
| CulicidaeLabConfig | CulicidaeLabConfig | The fully validated configuration object. |
instantiate_from_config(config_obj: Any, extra_params: dict[str, Any] | None = None, **kwargs: Any) -> Any
    Instantiates a Python object from its Pydantic config model.
The config model must have a target field specifying the fully
qualified class path (e.g., 'my_module.my_class.MyClass').
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config_obj | Any | A Pydantic model instance (e.g., a predictor config). | required | 
| extra_params | dict[str, Any] \| None | A dictionary of extra parameters to inject into the constructor. | None |
| **kwargs | Any | Additional keyword arguments to pass to the object's constructor, overriding any existing parameters in the config. | {} | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Any | Any | An instantiated Python object. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If the config object does not specify a target. |
| ImportError | If the class could not be imported and instantiated. | 
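Resolving a `target` path into an object can be sketched with `importlib`. This simplified version takes a plain dict instead of a Pydantic model, the helper name `instantiate_from_target` is hypothetical, and the precedence order (config values, then `extra_params`, then keyword arguments) is an assumption.

```python
import importlib
from typing import Any, Optional


def instantiate_from_target(
    config: dict,
    extra_params: Optional[dict] = None,
    **kwargs: Any,
) -> Any:
    params = dict(config)
    target = params.pop("target", None)
    if not target:
        raise ValueError("Config has no 'target' field.")
    # Split 'package.module.ClassName' into module path and class name.
    module_path, _, class_name = target.rpartition(".")
    try:
        cls = getattr(importlib.import_module(module_path), class_name)
    except (ImportError, AttributeError, ValueError) as exc:
        raise ImportError(f"Could not import '{target}'.") from exc
    # Assumed precedence: config < extra_params < kwargs.
    params.update(extra_params or {})
    params.update(kwargs)
    return cls(**params)


# A standard-library class is used as the target so the sketch runs anywhere.
value = instantiate_from_target(
    {"target": "fractions.Fraction", "numerator": 1, "denominator": 4},
    denominator=2,
)
```

Here the keyword argument `denominator=2` overrides the config value of 4, so `value` is the fraction 1/2.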
Source code in culicidaelab\core\config_manager.py
              
save_config(file_path: str | Path) -> None
    Saves the current configuration state to a YAML file.
This is useful for exporting the fully merged and validated config.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| file_path | str \| Path | The path where the YAML config will be saved. | required |
Source code in culicidaelab\core\config_manager.py
              
CulicidaeLabConfig
    The root Pydantic model for all CulicidaeLab configurations.
This model validates the entire configuration structure after it is loaded from YAML files, serving as the single source of truth for all settings.
Attributes:
| Name | Type | Description | 
|---|---|---|
| config_version | str | The version of the configuration schema. This is used to ensure compatibility with the library version. | 
| app_settings | AppSettings | Core application settings. | 
| processing | ProcessingConfig | Default processing parameters. | 
| datasets | dict[str, DatasetConfig] | A mapping of dataset names to their configs. | 
| predictors | dict[str, PredictorConfig] | A mapping of predictor names to their configs. | 
| providers | dict[str, ProviderConfig] | A mapping of provider names to their configs. | 
| species | SpeciesModel | Configuration and metadata related to all species. | 
Source code in culicidaelab\core\config_models.py
                
model_config = ConfigDict(extra='allow')
  
      class-attribute
      instance-attribute
  
    
config_version: str = Field(default=CONFIG_SCHEMA_VERSION)
  
      class-attribute
      instance-attribute
  
    
app_settings: AppSettings = Field(default_factory=AppSettings)
  
      class-attribute
      instance-attribute
  
    
processing: ProcessingConfig = Field(default_factory=ProcessingConfig)
  
      class-attribute
      instance-attribute
  
    
datasets: dict[str, DatasetConfig] = Field(default_factory=dict)
  
      class-attribute
      instance-attribute
  
    
predictors: dict[str, PredictorConfig] = Field(default_factory=dict)
  
      class-attribute
      instance-attribute
  
    
providers: dict[str, ProviderConfig] = Field(default_factory=dict)
  
      class-attribute
      instance-attribute
  
    
species: SpeciesModel = Field(default_factory=SpeciesModel)
  
      class-attribute
      instance-attribute
  
    
DatasetConfig
    Configuration for a single dataset.
Attributes:
| Name | Type | Description | 
|---|---|---|
| name | str | The unique internal name for the dataset. | 
| path | str | The local directory path for storing the dataset. | 
| format | str | The dataset format (e.g., "imagefolder", "coco", "yolo"). | 
| classes | list[str] | A list of class names present in the dataset. | 
| provider_name | str | The name of the data provider (e.g., "huggingface"). | 
| repository | str | The repository ID on the provider's platform. | 
| config_name | str \| None | The specific configuration of a Hugging Face dataset. |
| derived_datasets | list[str] \| None | A list of Hugging Face repository IDs for datasets that were derived from this one. Defaults to None. |
| trained_models_repositories | list[str] \| None | A list of Hugging Face repository IDs for models trained on this dataset. Defaults to None. |
Source code in culicidaelab\core\config_models.py
                
model_config = ConfigDict(extra='allow')
  
      class-attribute
      instance-attribute
  
    
name: str
  
      instance-attribute
  
    
path: str
  
      instance-attribute
  
    
format: str
  
      instance-attribute
  
    
classes: list[str]
  
      instance-attribute
  
    
provider_name: str
  
      instance-attribute
  
    
repository: str
  
      instance-attribute
  
    
config_name: str | None = 'default'
  
      class-attribute
      instance-attribute
  
    
derived_datasets: list[str] | None = None
  
      class-attribute
      instance-attribute
  
    
trained_models_repositories: list[str] | None = None
  
      class-attribute
      instance-attribute
  
    
PredictorConfig
    Configuration for a single inference predictor.
This model defines how to load and use a specific pre-trained model for inference.
Attributes:
| Name | Type | Description | 
|---|---|---|
| target | str | The fully qualified import path to the predictor class. |
| confidence | float | The default confidence threshold for this predictor. |
| device | str | The compute device to use ("cpu" or "cuda"). |
| backend | str \| None | The specific inference backend to use (e.g., 'yolo'). |
| params | dict[str, Any] | A dictionary of extra parameters to pass to the predictor's constructor. |
| repository_id | str \| None | The Hugging Face Hub repository ID for the model. |
| weights | dict[str, WeightDetails] \| None | A mapping of backend names to their weight details. |
| provider_name | str \| None | The name of the provider (e.g., "huggingface"). |
| model_arch | str \| None | The model architecture name (e.g., "yolov8n-seg"). |
| model_config_path | str \| None | The path to the model's specific config file. |
| model_config_filename | str \| None | The filename of the model's config. |
| visualization | VisualizationConfig | Custom visualization settings for this predictor. | 
Source code in culicidaelab\core\config_models.py
                
model_config = ConfigDict(extra='allow', protected_namespaces=())
  
      class-attribute
      instance-attribute
  
    
target: str = Field(..., alias='target')
  
      class-attribute
      instance-attribute
  
    
confidence: float = 0.5
  
      class-attribute
      instance-attribute
  
    
device: str = 'cpu'
  
      class-attribute
      instance-attribute
  
    
backend: str | None = None
  
      class-attribute
      instance-attribute
  
    
params: dict[str, Any] = Field(default_factory=dict)
  
      class-attribute
      instance-attribute
  
    
repository_id: str | None = None
  
      class-attribute
      instance-attribute
  
    
weights: dict[str, WeightDetails] | None = None
  
      class-attribute
      instance-attribute
  
    
provider_name: str | None = None
  
      class-attribute
      instance-attribute
  
    
model_arch: str | None = None
  
      class-attribute
      instance-attribute
  
    
model_config_path: str | None = None
  
      class-attribute
      instance-attribute
  
    
model_config_filename: str | None = None
  
      class-attribute
      instance-attribute
  
    
visualization: VisualizationConfig = Field(default_factory=VisualizationConfig)
  
      class-attribute
      instance-attribute
  
    
ProviderConfig
    Configuration for a data provider, such as Hugging Face.
Attributes:
| Name | Type | Description | 
|---|---|---|
| target | str | The fully qualified import path to the provider's service class. | 
| dataset_url | str | The base URL for accessing datasets from this provider. | 
| api_key | str \| None | An optional API key for authentication, if required. |
Source code in culicidaelab\core\config_models.py
                
model_config = ConfigDict(extra='allow')
  
      class-attribute
      instance-attribute
  
    
target: str = Field(..., alias='target')
  
      class-attribute
      instance-attribute
  
    
dataset_url: str
  
      instance-attribute
  
    
api_key: str | None = None
  
      class-attribute
      instance-attribute
  
    
SpeciesModel
    Configuration for the entire 'species' section of the config.
Attributes:
| Name | Type | Description | 
|---|---|---|
| species_classes | dict[int, str] | A mapping of integer class IDs to string-based species names. | 
| species_metadata | SpeciesFiles | The aggregated species metadata loaded from the species directory. | 
Source code in culicidaelab\core\config_models.py
                
model_config = ConfigDict(extra='allow')
  
      class-attribute
      instance-attribute
  
    
species_classes: dict[int, str] = Field(default_factory=dict)
  
      class-attribute
      instance-attribute
  
    
species_metadata: SpeciesFiles = Field(default_factory=SpeciesFiles)
  
      class-attribute
      instance-attribute
  
    
SpeciesConfig
    A user-friendly facade for accessing and managing species configuration data.
This class implements the Facade pattern to simplify access to species-related configuration data. It provides an intuitive interface for managing species information, including class mappings, metadata, and name translations.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config | SpeciesModel | A validated Pydantic model containing the complete species configuration data. | required | 
Attributes:
| Name | Type | Description | 
|---|---|---|
| _config | SpeciesModel | The source configuration model containing raw data. | 
| _species_map | dict[int, str] | Maps numeric class indices to full species names. | 
| _reverse_species_map | dict[str, int] | Maps full species names to their numeric indices. | 
| _metadata_store | dict | Contains detailed metadata for each species. | 
| class_to_full_name_map | dict[str, str] | Maps short class names to full scientific names. | 
| reverse_class_to_full_name_map | dict[str, str] | Maps full scientific names to short class names. | 
Example
Source code in culicidaelab\core\species_config.py
class_to_full_name_map = self._config.species_metadata.species_info_mapping
  
      instance-attribute
  
    
reverse_class_to_full_name_map = {v: k for (k, v) in self.class_to_full_name_map.items()}
  
      instance-attribute
  
    
species_map: dict[int, str]
  
      property
  
    Gets the mapping of class indices to full, human-readable species names.
Returns:
| Type | Description | 
|---|---|
| dict[int, str] | dict[int, str]: A dictionary mapping numeric class indices to full scientific species names. | 
__init__(config: SpeciesModel)
    Initializes the species configuration helper.
Sets up internal mappings and data structures for efficient species data access. Processes the input configuration to create bidirectional mappings between species names, class names, and indices.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config | SpeciesModel | The validated species configuration model. | required | 
Source code in culicidaelab\core\species_config.py
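The bidirectional mappings described above can be sketched with plain dictionaries. This is a stand-in rather than the library's implementation: `SpeciesLookup` and the sample data are hypothetical, and it assumes that `species_classes` holds short labels that are translated to full names through the label mapping. Only the method names mirror the documented ones.

```python
from __future__ import annotations


class SpeciesLookup:
    """Hypothetical stand-in that mirrors the documented lookups."""

    def __init__(
        self,
        species_classes: dict[int, str],
        label_to_full_name: dict[str, str],
    ) -> None:
        # Short label -> full scientific name, and the inverse mapping.
        self.class_to_full_name_map = dict(label_to_full_name)
        self.reverse_class_to_full_name_map = {
            full: label for label, full in label_to_full_name.items()
        }
        # Class index -> full name (falling back to the raw label), and its inverse.
        self.species_map = {
            idx: self.class_to_full_name_map.get(label, label)
            for idx, label in species_classes.items()
        }
        self._reverse_species_map = {
            name: idx for idx, name in self.species_map.items()
        }

    def get_species_by_index(self, index: int) -> str | None:
        return self.species_map.get(index)

    def get_index_by_species(self, species_name: str) -> int | None:
        return self._reverse_species_map.get(species_name)


# Sample data for illustration only; not taken from the library's configuration.
lookup = SpeciesLookup(
    species_classes={0: "ae_aegypti", 1: "cx_pipiens"},
    label_to_full_name={
        "ae_aegypti": "Aedes aegypti",
        "cx_pipiens": "Culex pipiens",
    },
)
print(lookup.get_species_by_index(0))                # Aedes aegypti
print(lookup.get_index_by_species("Culex pipiens"))  # 1
print(lookup.get_index_by_species("Unknown"))        # None
```

Building both directions once in the constructor keeps each lookup a single dictionary access, at the cost of holding two copies of the mapping.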
              
get_index_by_species(species_name: str) -> int | None
    Gets the numeric class index for a given species name.
Looks up the numeric class index used by the model for a given full species name. This is useful for mapping between model predictions and species names.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| species_name | str | The full scientific name of the species (e.g., "Aedes aegypti"). | required | 
Returns:
| Type | Description | 
|---|---|
| int | None | int | None: The numeric class index used by the model, or None if the species is not found in the configuration. | 
Source code in culicidaelab\core\species_config.py
              
get_species_by_index(index: int) -> str | None
    Gets the full scientific species name for a given class index.
Converts a numeric class index used by the model into the corresponding full scientific species name. This is particularly useful when processing model predictions.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| index | int | The numeric class index used by the model. | required | 
Returns:
| Type | Description | 
|---|---|
| str | None | str | None: The full scientific species name as a string, or None if the index is not found in the configuration. | 
Source code in culicidaelab\core\species_config.py
              
get_species_label(species_name: str) -> str
    Gets the short label/class name for a given full species name.
Converts a full scientific species name to its corresponding short label used in the dataset and model classifications.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| species_name | str | The full scientific name of the species (e.g., "Aedes aegypti"). | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | The short label/class name used in the dataset (e.g., "ae_aegypti"). | 
Source code in culicidaelab\core\species_config.py
              
get_species_metadata(species_name: str) -> dict[str, Any] | None
    Gets the detailed metadata for a specific species.
Retrieves comprehensive metadata about a species, including taxonomic information, characteristics, and any custom metadata fields defined in the configuration.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| species_name | str | The full scientific name of the species (e.g., "Aedes aegypti"). | required | 
Returns:
| Type | Description | 
|---|---|
| dict[str, Any] | None | dict[str, Any] | None: A dictionary containing all metadata fields for the species, or None if the species is not found. The dictionary structure depends on the metadata fields defined in the configuration. | 
Example
Source code in culicidaelab\core\species_config.py
              
list_species_names() -> list[str]
    Returns a list of all configured full species names.
Provides a complete list of all species names that are configured in the system. The names are returned in their full scientific format.
Returns:
| Type | Description | 
|---|---|
| list[str] | list[str]: A list of full scientific species names configured in the system. | 
Example
Source code in culicidaelab\core\species_config.py
              
BoundingBox
    Represents a single bounding box with coordinates.
Attributes:
| Name | Type | Description | 
|---|---|---|
| x1 | float | The top-left x-coordinate of the bounding box. | 
| y1 | float | The top-left y-coordinate of the bounding box. | 
| x2 | float | The bottom-right x-coordinate of the bounding box. | 
| y2 | float | The bottom-right y-coordinate of the bounding box. | 
Source code in culicidaelab\core\prediction_models.py
                
x1: float = Field(..., description='Top-left x-coordinate')
  
      class-attribute
      instance-attribute
  
    
y1: float = Field(..., description='Top-left y-coordinate')
  
      class-attribute
      instance-attribute
  
    
x2: float = Field(..., description='Bottom-right x-coordinate')
  
      class-attribute
      instance-attribute
  
    
y2: float = Field(..., description='Bottom-right y-coordinate')
  
      class-attribute
      instance-attribute
  
    
to_numpy() -> np.ndarray
    Converts the bounding box to a NumPy array.
Returns:
| Type | Description | 
|---|---|
| ndarray | np.ndarray: A NumPy array of shape (4,) in the format [x1, y1, x2, y2]. | 
Detection
    Represents a single detected object, including its bounding box and confidence.
Attributes:
| Name | Type | Description | 
|---|---|---|
| box | BoundingBox | The bounding box of the detected object. | 
| confidence | float | The confidence score of the prediction, between 0.0 and 1.0. | 
Source code in culicidaelab\core\prediction_models.py
                
box: BoundingBox
  
      instance-attribute
  
    
confidence: float = Field(..., ge=0.0, le=1.0, description='Prediction confidence score')
  
      class-attribute
      instance-attribute
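The `ge=0.0, le=1.0` constraints on `confidence` mean that values outside the closed interval [0.0, 1.0] are rejected at construction time. The sketch below imitates that behaviour with standard-library dataclasses so it runs without Pydantic; `BoxSketch` and `DetectionSketch` are hypothetical names, and the library's own models raise a Pydantic validation error rather than a plain `ValueError`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BoxSketch:
    """Hypothetical stand-in for BoundingBox."""

    x1: float
    y1: float
    x2: float
    y2: float

    def to_list(self) -> list[float]:
        # Same ordering as the documented to_numpy(): [x1, y1, x2, y2].
        return [self.x1, self.y1, self.x2, self.y2]


@dataclass(frozen=True)
class DetectionSketch:
    """Hypothetical stand-in for Detection."""

    box: BoxSketch
    confidence: float

    def __post_init__(self) -> None:
        # Imitates Field(..., ge=0.0, le=1.0): both bounds are inclusive.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


detection = DetectionSketch(BoxSketch(10.0, 20.0, 110.0, 220.0), confidence=0.87)
print(detection.box.to_list())  # [10.0, 20.0, 110.0, 220.0]

try:
    DetectionSketch(BoxSketch(0.0, 0.0, 1.0, 1.0), confidence=1.5)
except ValueError as err:
    print(f"rejected: {err}")
```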
  
    
DetectionPrediction
    Represents the output of a detection model for a single image.
Attributes:
| Name | Type | Description | 
|---|---|---|
| detections | list[Detection] | A list of all objects detected in the image. | 
Source code in culicidaelab\core\prediction_models.py
                
              
detections: list[Detection]
  
      instance-attribute
  
    
SegmentationPrediction
    Represents the output of a segmentation model for a single image.
Attributes:
| Name | Type | Description | 
|---|---|---|
| mask | ndarray | A 2D NumPy array (H, W) representing the binary segmentation mask, where non-zero values indicate the segmented object. | 
| pixel_count | int | The total number of positive (masked) pixels in the mask. | 
Source code in culicidaelab\core\prediction_models.py
                
model_config = ConfigDict(arbitrary_types_allowed=True)
  
      class-attribute
      instance-attribute
  
    
mask: np.ndarray = Field(..., description='Binary segmentation mask as a NumPy array (H, W)')
  
      class-attribute
      instance-attribute
  
    
pixel_count: int = Field(..., description='Number of positive (masked) pixels')
  
      class-attribute
      instance-attribute
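As a small illustration of how `pixel_count` relates to `mask`, the sketch below counts the non-zero entries of a 2D mask. It uses nested lists to stay dependency-free; with a NumPy array the same count could be obtained with `np.count_nonzero(mask)`. How the library itself computes the field is not shown in this reference, so treat this as an assumption.

```python
# A 3x4 binary mask; non-zero values mark the segmented object.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Count every positive (masked) pixel.
pixel_count = sum(1 for row in mask for value in row if value != 0)

height, width = len(mask), len(mask[0])
print(f"{pixel_count} of {height * width} pixels are masked")  # 4 of 12 pixels are masked
```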
  
    
Classification
    Represents a single classification result with species name and confidence.
Attributes:
| Name | Type | Description | 
|---|---|---|
| species_name | str | The predicted species name. | 
| confidence | float | The confidence score of the prediction, between 0.0 and 1.0. | 
Source code in culicidaelab\core\prediction_models.py
                
species_name: str
  
      instance-attribute
  
    
confidence: float = Field(..., ge=0.0, le=1.0, description='Prediction confidence score')
  
      class-attribute
      instance-attribute
  
    
ClassificationPrediction
    Represents the full output of a classification model for a single image.
The predictions are typically sorted by confidence in descending order.
Attributes:
| Name | Type | Description | 
|---|---|---|
| predictions | list[Classification] | A list of classification results. | 
Source code in culicidaelab\core\prediction_models.py
                
predictions: list[Classification]
  
      instance-attribute
  
    
top_prediction() -> Classification | None
    Returns the top prediction (the one with the highest confidence).
Returns:
| Type | Description | 
|---|---|
| Classification | None | The top classification result, or None if there are no predictions. | 
Source code in culicidaelab\core\prediction_models.py
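A minimal sketch of the `top_prediction()` contract, using dataclasses in place of the Pydantic models. The class names are hypothetical. The sketch selects the maximum by confidence so that it does not depend on the list being pre-sorted; the library may instead rely on the descending order noted above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClassificationSketch:
    species_name: str
    confidence: float


@dataclass
class ClassificationPredictionSketch:
    predictions: list[ClassificationSketch] = field(default_factory=list)

    def top_prediction(self) -> ClassificationSketch | None:
        # An empty result has no top prediction.
        if not self.predictions:
            return None
        return max(self.predictions, key=lambda p: p.confidence)


result = ClassificationPredictionSketch(
    predictions=[
        ClassificationSketch("Culex pipiens", 0.12),
        ClassificationSketch("Aedes aegypti", 0.81),
    ]
)
top = result.top_prediction()
print(top.species_name if top else "no predictions")  # Aedes aegypti
print(ClassificationPredictionSketch().top_prediction())  # None
```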
              
ProviderService
    Manages the instantiation and lifecycle of data providers.
This service acts as a factory and cache for provider instances, ensuring that each provider is a singleton within the application context.
Attributes:
| Name | Type | Description | 
|---|---|---|
| _settings | Settings | The settings instance. | 
| _providers | dict[str, BaseProvider] | A cache of instantiated providers, keyed by provider name. | 
Source code in culicidaelab\core\provider_service.py
                
__init__(settings: Settings)
    Initializes the ProviderService.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| settings | Settings | The main Settings object. | required | 
get_provider(provider_name: str) -> BaseProvider
    Retrieves an instantiated provider by its name.
It looks up the provider's configuration, instantiates it if it hasn't been already, and caches it for future calls.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| provider_name | str | The name of the provider (e.g., 'huggingface'). | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| BaseProvider | BaseProvider | An instance of a class that inherits from BaseProvider. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If the provider is not found in the configuration. | 
Source code in culicidaelab\core\provider_service.py
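The factory-plus-cache behaviour described for `get_provider` can be sketched as follows. `ProviderCacheSketch`, `DummyProvider`, and the `factories` argument are hypothetical; the real service builds providers from the configuration rather than from a dictionary of callables.

```python
from __future__ import annotations

from typing import Callable


class DummyProvider:
    """Hypothetical provider used only for this illustration."""


class ProviderCacheSketch:
    def __init__(self, factories: dict[str, Callable[[], object]]) -> None:
        self._factories = factories
        self._providers: dict[str, object] = {}  # cache keyed by provider name

    def get_provider(self, provider_name: str) -> object:
        # Instantiate on first request, then reuse the cached instance.
        if provider_name not in self._providers:
            if provider_name not in self._factories:
                raise ValueError(f"Provider '{provider_name}' not found in configuration.")
            self._providers[provider_name] = self._factories[provider_name]()
        return self._providers[provider_name]


service = ProviderCacheSketch({"huggingface": DummyProvider})
first = service.get_provider("huggingface")
second = service.get_provider("huggingface")
print(first is second)  # True
```

Because the cache lives on the service instance, the single-instance guarantee holds per service object, which matches the "within the application context" wording above.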
              
ResourceManager
    Centralized resource management for models, datasets, and temporary files.
This class provides thread-safe operations for managing application resources, including models, datasets, cache files, and temporary workspaces. It ensures that all file operations are handled in a consistent and safe manner.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| app_name | str | The name of the application, used for creating dedicated directories. If not provided, it is inferred automatically. | None | 
| custom_base_dir | str | Path | A custom base directory for storing all resources. If None, system-appropriate default directories are used (e.g., AppData on Windows). Defaults to None. | None | 
Attributes:
| Name | Type | Description | 
|---|---|---|
| app_name | str | The application name. | 
| user_data_dir | Path | The root directory for user-specific data. | 
| user_cache_dir | Path | The directory for user-specific cache files. | 
| temp_dir | Path | The directory for temporary runtime files. | 
| model_dir | Path | The directory where model files are stored. | 
| dataset_dir | Path | The directory where datasets are stored. | 
| downloads_dir | Path | The directory for downloaded files. | 
| logs_dir | Path | The directory for log files. | 
| config_dir | Path | The directory for configuration files. | 
Raises:
| Type | Description | 
|---|---|
| OSError | If the resource directories cannot be created. | 
| ValueError | If the application name cannot be determined. | 
Source code in culicidaelab\core\resource_manager.py
app_name = self._determine_app_name(app_name)
  
      instance-attribute
  
    
__init__(app_name: str | None = None, custom_base_dir: str | Path | None = None)
    Initializes the ResourceManager with cross-platform compatibility.
Sets up the necessary directory structure for the application's resources.
Source code in culicidaelab\core\resource_manager.py
              
__repr__() -> str
    Returns a string representation of the ResourceManager instance.
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | A string representation of the object. | 
Source code in culicidaelab\core\resource_manager.py
              
            
temp_workspace(prefix: str = 'workspace', suffix: str = '')
    Provides a temporary workspace that is automatically cleaned up.
This context manager creates a temporary directory and yields its path, ensuring the directory and its contents are removed upon exiting the context, even if errors occur.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| prefix | str | A prefix for the temporary directory's name. | 'workspace' | 
| suffix | str | A suffix for the temporary directory's name. | '' | 
Yields:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The path to the temporary workspace. | 
Example
    >>> resource_manager = ResourceManager()
    >>> with resource_manager.temp_workspace(prefix="job_") as ws:
    ...     # Perform temporary operations within this workspace
    ...     (ws / "temp_file.txt").write_text("some data")
    ...     print(f"Workspace created at: {ws}")
    >>> # The workspace directory is automatically removed here.
Source code in culicidaelab\core\resource_manager.py
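For readers curious how such a context manager can guarantee cleanup, here is a minimal sketch built on `contextlib` and `tempfile`. The function name and the `base_dir` argument are hypothetical, and the library's implementation may differ, for example in where it places the directory.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temp_workspace_sketch(base_dir, prefix="workspace", suffix=""):
    """Yield a fresh temporary directory and remove it on exit."""
    workspace = Path(tempfile.mkdtemp(prefix=prefix, suffix=suffix, dir=base_dir))
    try:
        yield workspace
    finally:
        # Runs on normal exit and when the body raises.
        shutil.rmtree(workspace, ignore_errors=True)


with temp_workspace_sketch(tempfile.gettempdir(), prefix="job_") as ws:
    (ws / "temp_file.txt").write_text("some data")
    existed_inside = ws.exists()

print(existed_inside, ws.exists())  # True False
```

Placing the removal in a `finally` block is what makes the cleanup hold even if the body raises.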
              
clean_old_files(days: int = 5, include_cache: bool = True) -> dict[str, int]
    Cleans up old files from download and temporary directories.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| days | int | The age in days for a file to be considered old. | 5 | 
| include_cache | bool | If True, the cache directory is also cleaned. | True | 
Returns:
| Type | Description | 
|---|---|
| dict[str, int] | dict[str, int]: A dictionary containing statistics of the cleanup. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If  | 
Source code in culicidaelab\core\resource_manager.py
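An age-based cleanup of this kind can be sketched with file modification times. The function name and the keys of the returned statistics dictionary are hypothetical, since this reference does not list them. The sketch handles a single directory and omits the cache handling and argument validation of the documented method.

```python
from __future__ import annotations

import os
import tempfile
import time
from pathlib import Path


def clean_old_files_sketch(directory: Path, days: int = 5) -> dict[str, int]:
    """Delete files whose modification time is older than `days` days."""
    cutoff = time.time() - days * 86400  # 86400 seconds per day
    stats = {"files_removed": 0, "bytes_freed": 0}  # hypothetical keys
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stats["bytes_freed"] += path.stat().st_size
            path.unlink()
            stats["files_removed"] += 1
    return stats


with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp) / "old.bin"
    new_file = Path(tmp) / "new.bin"
    old_file.write_bytes(b"x" * 10)
    new_file.write_bytes(b"y" * 10)
    # Backdate one file by ten days so that it counts as old.
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old_file, (ten_days_ago, ten_days_ago))

    stats = clean_old_files_sketch(Path(tmp), days=5)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(stats, remaining)  # {'files_removed': 1, 'bytes_freed': 10} ['new.bin']
```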
              
create_checksum(file_path: str | Path, algorithm: str = 'md5') -> str
    Creates a checksum for a given file.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| file_path | str | Path | The path to the file. | required | 
| algorithm | str | The hashing algorithm to use (e.g., 'md5', 'sha256'). | 'md5' | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | The hexadecimal checksum string. | 
Raises:
| Type | Description | 
|---|---|
| FileNotFoundError | If the specified file does not exist. | 
| OSError | If there is an error reading the file. | 
Source code in culicidaelab\core\resource_manager.py
              
get_all_directories() -> dict[str, Path]
    Retrieves all managed directory paths.
Returns:
| Type | Description | 
|---|---|
| dict[str, Path] | dict[str, Path]: A dictionary mapping directory names to their paths. | 
Source code in culicidaelab\core\resource_manager.py
              
get_dataset_path(dataset_name: str, create_if_missing: bool = True) -> Path
    Constructs a standardized path for a dataset.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| dataset_name | str | The name of the dataset. | required | 
| create_if_missing | bool | If True, creates the directory if it does not exist. | True | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The absolute path to the dataset directory. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If  | 
Source code in culicidaelab\core\resource_manager.py
              
get_disk_usage() -> dict[str, dict[str, int | str]]
    Calculates disk usage for all managed directories.
Returns:
| Name | Type | Description | 
|---|---|---|
| dict | dict[str, dict[str, int | str]] | A dictionary with disk usage details for each directory, including size in bytes, human-readable size, and file count. | 
Source code in culicidaelab\core\resource_manager.py
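The per-directory figures mentioned above (size in bytes, human-readable size, file count) can be computed as in this sketch. The function names and dictionary keys are hypothetical; the keys actually returned by `get_disk_usage()` are not listed in this reference.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def format_bytes(size: int) -> str:
    """Render a byte count using binary (1024-based) units."""
    value = float(size)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if value < 1024 or unit == "TB":
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} TB"  # not reached; keeps the return type explicit


def directory_usage(directory: Path) -> dict[str, int | str]:
    """Summarise the regular files below `directory`."""
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    total = sum(p.stat().st_size for p in files)
    return {
        "size_bytes": total,
        "size_human": format_bytes(total),
        "file_count": len(files),
    }


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.bin").write_bytes(b"\0" * 1024)
    (Path(tmp) / "b.bin").write_bytes(b"\0" * 512)
    usage = directory_usage(Path(tmp))

print(usage)  # {'size_bytes': 1536, 'size_human': '1.5 KB', 'file_count': 2}
```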
              
verify_checksum(file_path: str | Path, expected_checksum: str, algorithm: str = 'md5') -> bool
    Verifies the checksum of a file against an expected value.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| file_path | str | Path | The path to the file. | required | 
| expected_checksum | str | The expected checksum. | required | 
| algorithm | str | The hashing algorithm used for the checksum. | 'md5' | 
Returns:
| Name | Type | Description | 
|---|---|---|
| bool | bool | True if the checksums match, False otherwise. | 
Source code in culicidaelab\core\resource_manager.py
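Checksum creation and verification of this kind are commonly implemented with `hashlib`, reading the file in chunks so that large model files do not need to fit in memory. The sketch below is an assumption about the approach, not the library's code; the function names and the `chunk_size` parameter are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def create_checksum_sketch(file_path, algorithm="md5", chunk_size=8192) -> str:
    """Return the hexadecimal digest of a file, read chunk by chunk."""
    hasher = hashlib.new(algorithm)
    with open(file_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_checksum_sketch(file_path, expected_checksum, algorithm="md5") -> bool:
    """Compare the file's digest with the expected value, ignoring case."""
    return create_checksum_sketch(file_path, algorithm) == expected_checksum.lower()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "weights.bin"
    sample.write_bytes(b"hello")
    digest = create_checksum_sketch(sample, algorithm="sha256")
    matches = verify_checksum_sketch(sample, digest.upper(), algorithm="sha256")
    mismatch = verify_checksum_sketch(sample, "0" * 64, algorithm="sha256")

print(digest == hashlib.sha256(b"hello").hexdigest(), matches, mismatch)  # True True False
```

MD5 is adequate for detecting accidental corruption; where tampering is a concern, passing `algorithm='sha256'` is the safer choice.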
              
Settings
    User-friendly facade for CulicidaeLab configuration management.
This class provides a simple, stable interface to access configuration values, resource directories, and application settings. All actual operations are delegated to a validated configuration object managed by ConfigManager and a ResourceManager.
The Settings class is implemented as a singleton to ensure consistent configuration state across the application. It manages:
- Configuration values through get_config() and set_config()
- Resource directories for models, datasets, and cache
- Dataset paths and splits
- Model weights paths and types
- API keys for external services
- Temporary workspaces for processing
Attributes:
| Name | Type | Description | 
|---|---|---|
| config | CulicidaeLabConfig | The current configuration object | 
| model_dir | Path | Directory for model weights | 
| dataset_dir | Path | Directory for datasets | 
| cache_dir | Path | Directory for cached data | 
| config_dir | Path | Active user configuration directory | 
| species_config | SpeciesConfig | Configuration for species detection | 
Source code in culicidaelab\core\settings.py
config: CulicidaeLabConfig = self._config_manager.get_config()
  
      instance-attribute
  
    
model_dir: Path
  
      property
  
    Model weights directory.
weights_dir: Path
  
      property
  
    Alias for model_dir.
dataset_dir: Path
  
      property
  
    Datasets directory.
cache_dir: Path
  
      property
  
    Cache directory.
config_dir: Path
  
      property
  
    The active user configuration directory.
species_config: SpeciesConfig
  
      property
  
    Species configuration (lazily loaded).
__init__(config_dir: str | Path | None = None) -> None
    Initializes the Settings facade.
This loads the configuration using a ConfigManager and sets up a ResourceManager for file paths.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config_dir | str | Path | None | Optional path to a user-provided configuration directory. | None | 
Source code in culicidaelab\core\settings.py
              
get_config(path: str | None = None, default: Any = None) -> Any
    Gets a configuration value using a dot-separated path.
Example
settings.get_config("predictors.classifier.confidence")
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| path | str | None | A dot-separated string path to the configuration value. If None, returns the entire configuration object. | None | 
| default | Any | A default value to return if the path is not found. | None | 
Returns:
| Type | Description | 
|---|---|
| Any | The configuration value, or the default value if not found. | 
Source code in culicidaelab\core\settings.py
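Dot-separated lookup of this kind amounts to walking the configuration one key at a time. The sketch below handles both attribute access and dictionary access and returns the default as soon as a segment is missing. `get_by_path` and the sample configuration are hypothetical and do not reproduce the library's code.

```python
from types import SimpleNamespace
from typing import Any

_MISSING = object()  # sentinel, so that a stored None still counts as "found"


def get_by_path(root: Any, path=None, default: Any = None) -> Any:
    """Resolve a dot-separated path against nested objects and dictionaries."""
    if path is None:
        return root
    current = root
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key, _MISSING)
        else:
            current = getattr(current, key, _MISSING)
        if current is _MISSING:
            return default
    return current


# Sample configuration for illustration only.
config = SimpleNamespace(
    predictors={"classifier": SimpleNamespace(confidence=0.5)},
)
print(get_by_path(config, "predictors.classifier.confidence"))        # 0.5
print(get_by_path(config, "predictors.detector.confidence", 0.25))    # 0.25
```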
              
set_config(path: str, value: Any) -> None
    Sets a configuration value at a specified dot-separated path. This method can traverse both objects (Pydantic models) and dictionaries.
Note: This modifies the configuration in memory. To make it persistent, call save_config().
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| path | str | A dot-separated string path to the configuration value. | required | 
| value | Any | The new value to set. | required | 
Source code in culicidaelab\core\settings.py
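Setting a value by path follows the same traversal as reading one: walk to the parent of the final segment, then assign. The sketch below covers plain objects and dictionaries and changes only the in-memory structure. `set_by_path` and the sample configuration are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def set_by_path(root: Any, path: str, value: Any) -> None:
    """Assign `value` at a dot-separated path, traversing objects and dicts."""
    *parents, last = path.split(".")
    current = root
    for key in parents:
        current = current[key] if isinstance(current, dict) else getattr(current, key)
    if isinstance(current, dict):
        current[last] = value
    else:
        setattr(current, last, value)


# Sample configuration for illustration only.
config = SimpleNamespace(
    predictors={"classifier": SimpleNamespace(confidence=0.5)},
)
set_by_path(config, "predictors.classifier.confidence", 0.9)
set_by_path(config, "predictors.detector", {"confidence": 0.3})

print(config.predictors["classifier"].confidence)  # 0.9
print(config.predictors["detector"])               # {'confidence': 0.3}
```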
              
save_config(file_path: str | Path | None = None) -> None
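    Saves the current configuration to a user config file.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| file_path | str | Path | None | Optional path to save the configuration file. If None, defaults to "culicidaelab_saved.yaml" in the user config directory. | None | 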
Source code in culicidaelab\core\settings.py
              
get_cache_key_for_split(split: str | list[str] | None) -> str
    Generates a unique, deterministic hash for any valid split configuration. This hash is used to create unique directory names for dataset splits.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| split | str | list[str] | None | The split configuration to hash. Can be a single split name (e.g., 'train'), a list of splits (e.g., ['train', 'val']), or None. | required | 
Returns:
| Name | Type | Description | 
|---|---|---|
| str | str | A 16-character hexadecimal hash that uniquely identifies the split configuration. This hash is deterministic for the same input. | 
Example
    >>> settings.get_cache_key_for_split('train')
    'a1b2c3d4e5f6g7h8'
    >>> settings.get_cache_key_for_split(['train', 'val'])
    'h8g7f6e5d4c3b2a1'
Source code in culicidaelab\core\settings.py
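One way to obtain a deterministic 16-character hexadecimal key is to serialise the split configuration and truncate a cryptographic digest. The sketch below uses JSON and SHA-256; both choices are assumptions, as is the decision to treat the order of a list as significant. The library's actual normalisation and hash function are not documented here.

```python
import hashlib
import json


def cache_key_for_split(split) -> str:
    """Hash a split configuration (str, list of str, or None) to 16 hex characters."""
    # JSON keeps 'train', ['train'], and None distinguishable from one another.
    payload = json.dumps(split)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


key_train = cache_key_for_split("train")
key_list = cache_key_for_split(["train", "val"])
key_none = cache_key_for_split(None)

print(len(key_train), key_train == cache_key_for_split("train"))  # 16 True
print(len({key_train, key_list, key_none}))                       # 3
```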
              
construct_split_path(dataset_base_path: Path, split: str | list[str] | None = None) -> Path
    Gets the standardized, absolute path for a dataset's directory.
This is the single source of truth for dataset path construction.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| dataset_base_path | Path | The base directory of the dataset. | required | 
| split | str | list[str] | None | If provided, returns the specific cache path for this split configuration. Otherwise, returns the base directory for the dataset. | None | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The absolute path to the dataset directory. | 
Source code in culicidaelab\core\settings.py
              
get_dataset_path(dataset_type: str, split: str | list[str] | None = None) -> Path
    Gets the standardized path for a specific dataset directory.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
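| dataset_type | str | The name of the dataset type (e.g., 'classification'). | required | 
| split | str | list[str] | None | Optional split configuration: a single split name, a list of splits, or None. | None | 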
Returns:
| Type | Description | 
|---|---|
| Path | An absolute path to the dataset directory. | 
Source code in culicidaelab\core\settings.py
              
list_datasets() -> list[str]
    Get list of configured dataset types in the application.
Returns:
| Type | Description | 
|---|---|
| list[str] | list[str]: A list of dataset type identifiers that are configured in the application settings. These correspond to the different dataset categories available for training and inference. | 
Example
    >>> settings.list_datasets()
    ['classification', 'detection', 'segmentation']
Source code in culicidaelab\core\settings.py
              
construct_weights_path(predictor_type: str, backend: str | None = None) -> Path
    A pure, static function to construct a fully qualified model weights path.
This is the single source of truth for model path construction, creating a
structured path like: .../models/
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| predictor_type | str | The type of the predictor (e.g., 'classifier'). Used as a subdirectory. | required | 
| backend | str | None | The target backend (e.g., 'torch', 'onnx'). If None, uses the default from the config. | None | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | The absolute, structured path to the model weights file. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If a valid backend or weights filename cannot be determined. | 
Source code in culicidaelab\core\settings.py
              
get_model_weights_path(model_type: str, backend: str | None = None) -> Path
    Gets the configured path to a model's weights file.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
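| model_type | str | The name of the model type (e.g., 'classifier'). | required | 
| backend | str | None | The target backend (e.g., 'torch', 'onnx'). If None, uses the default from the config. | None | 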
Returns:
| Type | Description | 
|---|---|
| Path | The path to the model weights file. | 
Source code in culicidaelab\core\settings.py
              
list_model_types() -> list[str]
    Get list of available model types configured in the application.
Returns:
| Type | Description | 
|---|---|
| list[str] | list[str]: A list of model type identifiers (e.g., ['classifier', 'detector', 'segmenter']) that are configured in the application. These types correspond to the different predictors available in the CulicidaeLab system. | 
Example
    >>> settings.list_model_types()
    ['classifier', 'detector', 'segmenter']
Source code in culicidaelab\core\settings.py
              
get_api_key(provider: str) -> str | None
    Get API key for external provider from environment variables.
The method looks for environment variables in the following format:
- KAGGLE_API_KEY for the 'kaggle' provider
- HUGGINGFACE_API_KEY for the 'huggingface' provider
- ROBOFLOW_API_KEY for the 'roboflow' provider
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| provider | str | The name of the provider. Must be one of: 'kaggle', 'huggingface', or 'roboflow'. | required | 
Returns:
| Type | Description | 
|---|---|
| str | None | str | None: The API key if found in environment variables, None if the provider is not supported or the key is not set. | 
Example
    >>> api_key = settings.get_api_key('huggingface')
    >>> if api_key:
    ...     pass  # Use the API key
    ... else:
    ...     pass  # Handle missing key
Source code in culicidaelab\core\settings.py
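The provider-to-variable mapping listed above can be expressed as a small lookup table. This sketch takes an optional `environ` mapping so that it can be exercised without touching the real environment. That parameter and the function name are hypothetical and are not part of the documented signature.

```python
from __future__ import annotations

import os
from typing import Mapping

# Environment variable names as listed in this reference.
_API_KEY_ENV_VARS = {
    "kaggle": "KAGGLE_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
    "roboflow": "ROBOFLOW_API_KEY",
}


def get_api_key_sketch(provider: str, environ: Mapping[str, str] | None = None) -> str | None:
    """Return the provider's key, or None for unknown providers or unset variables."""
    env = os.environ if environ is None else environ
    variable = _API_KEY_ENV_VARS.get(provider)
    if variable is None:
        return None
    return env.get(variable)


fake_env = {"HUGGINGFACE_API_KEY": "hf_example_token"}  # illustrative value only
print(get_api_key_sketch("huggingface", fake_env))  # hf_example_token
print(get_api_key_sketch("kaggle", fake_env))       # None
print(get_api_key_sketch("unknown", fake_env))      # None
```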
              
temp_workspace(prefix: str = 'workspace')
    Creates a temporary workspace directory that is automatically cleaned up.
This context manager creates a temporary directory for processing operations and automatically cleans it up when the context is exited.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| prefix | str | Prefix for the temporary directory name. Defaults to "workspace". | 'workspace' | 
Yields:
| Name | Type | Description | 
|---|---|---|
| Path | Path | Path to the temporary workspace directory. | 
Example
    >>> with settings.temp_workspace(prefix='processing') as workspace:
    ...     # Do some work in the temporary directory
    ...     (workspace / 'output.txt').write_text('results')
    >>> # The directory is automatically cleaned up after the with block.
Source code in culicidaelab\core\settings.py
              
instantiate_from_config(config_path: str, **kwargs: Any) -> Any
    Instantiates an object from a configuration path.
This is a convenience method that finds a config object by its path and uses the underlying ConfigManager to instantiate it.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config_path | str | A dot-separated path to the configuration object (e.g., "predictors.classifier"). | required | 
| **kwargs | Any | Additional keyword arguments to pass to the constructor. | {} | 
Returns:
| Type | Description | 
|---|---|
| Any | The instantiated object. | 
Source code in culicidaelab\core\settings.py
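Instantiating an object from a configuration entry usually means resolving a fully qualified import path, such as the `target` field documented for ProviderConfig, and calling the resulting class with the remaining arguments. The sketch below shows that mechanism with `importlib`. Whether ConfigManager resolves targets in exactly this way is an assumption, and `instantiate_from_target` is a hypothetical helper.

```python
import importlib
from typing import Any


def instantiate_from_target(target: str, **kwargs: Any) -> Any:
    """Import `package.module.ClassName` and call it with `kwargs`."""
    module_path, _, attribute = target.rpartition(".")
    if not module_path:
        raise ValueError(f"'{target}' is not a fully qualified import path")
    cls = getattr(importlib.import_module(module_path), attribute)
    return cls(**kwargs)


# A standard-library class stands in for a predictor or provider class.
delta = instantiate_from_target("datetime.timedelta", hours=2)
print(delta.total_seconds())  # 7200.0
```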
              
get_settings(config_dir: str | Path | None = None) -> Settings
    Get the Settings singleton instance.
This is the primary way to access Settings throughout the application.
If a config_dir is provided that differs from the existing instance, a new instance will be created and returned.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| config_dir | str \| Path \| None | Optional path to a user-provided configuration directory. | None | 
Returns:
| Type | Description | 
|---|---|
| Settings | The Settings instance. | 
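The singleton behaviour described above (reuse the existing instance unless a different config_dir is requested) can be sketched as follows. This is an illustrative pattern under stated assumptions, not the library's code: the `Settings` class here is a hypothetical stand-in that only records its directory, and the module-level `_instance` variable is likewise assumed.

```python
from pathlib import Path
from typing import Optional, Union


class Settings:
    """Hypothetical stand-in that only remembers its configuration directory."""

    def __init__(self, config_dir: Optional[Path] = None) -> None:
        self.config_dir = config_dir


_instance: Optional[Settings] = None


def get_settings(config_dir: Union[str, Path, None] = None) -> Settings:
    """Return the shared instance, rebuilding it when a different config_dir is given."""
    global _instance
    requested = Path(config_dir).resolve() if config_dir is not None else None
    if _instance is None or (requested is not None and requested != _instance.config_dir):
        _instance = Settings(requested)
    return _instance


first = get_settings()
second = get_settings()
custom = get_settings("custom_conf")
print(first is second, custom is first, get_settings() is custom)
```

In this sketch, calling `get_settings()` without arguments after a custom directory has been set keeps returning the custom instance, because only an explicitly different directory triggers re-creation.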
Source code in culicidaelab\core\settings.py
              
download_file(url: str, destination: str | Path | None = None, downloads_dir: str | Path | None = None, progress_callback: Callable | None = None, chunk_size: int = 8192, timeout: int = 30, desc: str | None = None) -> Path
    Downloads a file from the specified URL, showing a progress bar and optionally calling a progress callback function. Supports both explicit destination paths and a default downloads directory.
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| url | str | The URL of the file to download. Must start with 'http://' or 'https://'. | required | 
| destination | Union[str, Path, None] | The complete file path where the downloaded file should be saved. If None, the file will be saved in downloads_dir with its original filename. Defaults to None. | None | 
| downloads_dir | Union[str, Path, None] | The directory to save the file in when no specific destination is provided. If None, uses current working directory. Defaults to None. | None | 
| progress_callback | Optional[Callable[[int, int], None]] | A function to call with progress updates. Takes two parameters: bytes downloaded and total bytes. Defaults to None. | None | 
| chunk_size | int | Size of chunks to download in bytes. Larger chunks use more memory but may download faster. Defaults to 8192. | 8192 | 
| timeout | int | Number of seconds to wait for server response before timing out. Defaults to 30. | 30 | 
| desc | Optional[str] | Custom description for the progress bar. If None, uses the filename. Defaults to None. | None | 
Returns:
| Name | Type | Description | 
|---|---|---|
| Path | Path | Path object pointing to the downloaded file. | 
Raises:
| Type | Description | 
|---|---|
| ValueError | If the URL is invalid or doesn't start with http(s). | 
| RuntimeError | If the download fails due to network issues or if writing the file fails due to permission or disk space issues. | 
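The interplay between `url`, `destination`, and `downloads_dir` documented above can be summarised in a short sketch that performs no network access. The helper name `resolve_destination` is hypothetical and is not part of the library; it only mirrors the documented rules (reject non-http(s) URLs, prefer an explicit destination, otherwise use the URL's filename inside `downloads_dir` or the current working directory).

```python
from pathlib import Path
from typing import Union
from urllib.parse import urlparse


def resolve_destination(
    url: str,
    destination: Union[str, Path, None] = None,
    downloads_dir: Union[str, Path, None] = None,
) -> Path:
    """Pick the target path for a download according to the documented rules."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"Invalid URL: {url}")
    if destination is not None:
        # An explicit destination wins over downloads_dir.
        return Path(destination)
    # Take the last path segment of the URL, ignoring any query string.
    filename = Path(urlparse(url).path).name
    base = Path(downloads_dir) if downloads_dir is not None else Path.cwd()
    return base / filename


print(resolve_destination("https://example.com/data.csv", downloads_dir="dl"))
```

A URL whose path has no final segment (for example one ending in `/`) yields an empty filename in this sketch; how the library handles that case is not stated in this reference.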
Example
    from pathlib import Path
    from culicidaelab.core import download_file

    # Basic download to the current directory
    path = download_file('https://example.com/data.csv')
    print(path)  # e.g. data.csv

    # Download with custom progress tracking
    def progress(current, total):
        print(f'Downloaded {current}/{total} bytes')

    path = download_file(
        'https://example.com/large_file.zip',
        destination='downloads/myfile.zip',
        progress_callback=progress,
    )
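A callback matching the documented `Callable[[int, int], None]` shape can also report percentages rather than raw byte counts. The sketch below is a hypothetical helper, not part of the library, and it assumes that the total may be reported as 0 when the size is unknown; that assumption is not confirmed by this reference. The callback is driven by hand here, so no download takes place.

```python
from typing import Callable, List


def make_percent_logger(log: List[str]) -> Callable[[int, int], None]:
    """Build a progress callback that appends a readable status to `log`."""

    def progress(current: int, total: int) -> None:
        if total > 0:
            log.append(f"{current / total:.0%}")
        else:
            # Assumed fallback for an unknown total size.
            log.append(f"{current} bytes")

    return progress


messages: List[str] = []
progress = make_percent_logger(messages)

# Simulate the calls a downloader might make after each chunk.
progress(4096, 8192)
progress(8192, 8192)
progress(1024, 0)
print(messages)
```

The returned function could then be passed as `progress_callback=progress`, since it accepts the two integer arguments described in the parameter table.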
Source code in culicidaelab\core\utils.py