src package
Subpackages
- src.pulsarsa package
- Subpackages
- Submodules
- src.pulsarsa.information_packet_formats module
- src.pulsarsa.neural_network_models module
- src.pulsarsa.pipeline_methods module
  - resize_image_linear()
  - ImageReader
  - LabelReader
  - ImageDataSet
  - LabelDataSet
  - PipelineImageToMask
  - PipelineImageToDelGraphtoIsPulsar: get_nns_from_pipeline(), image_to_mask_method(), mask_to_signal_method(), signal_to_label_method(), validate_efficiency(), display_results_in_batch(), test_on_real_data_from_npy_files()
  - PipelineImageToFilterDelGraphtoIsPulsar: skip_filtering(), get_nns_from_pipeline(), image_to_mask_method(), filter_mask_method(), mask_to_signal_method(), signal_to_label_method(), validate_efficiency(), display_results_in_batch(), test_on_real_data_from_npy_files()
  - PipelineImageToCCtoLabels: get_nns_from_pipeline(), image_to_mask_method(), mask_to_labelled_skeleton_method(), labelled_skeleton_to_labels_method(), display_results_in_batch(), test_on_real_data_from_npy_files(), measure_accuracy(), validate_efficiency()
  - PipelineImageToFilterToCCtoLabels: plot(), remove_background(), skip_filtering(), get_nns_from_pipeline(), image_to_mask_method(), mask_to_labelled_skeleton_method(), filter_mask_method(), filter_imagemask_method(), analyse_signal_noise_segement_ratio(), labelled_skeleton_to_labels_method(), labelled_skeleton_to_labels_for_one_pulse_method(), display_results_in_batch(), test_on_real_data_from_npy_files(), measure_accuracy(), validate_efficiency()
  - PipelineCustomFilter0: run_protocol_batch_compatible_track_time(), run_protocol_batch_compatible(), minmax_per_channel(), make_full_filter_bank(), make_kernel(), perform_conv(), perform_featuremap_reshaping(), perform_sigmoid_activation(), perform_segregation_metric(), perform_theta_based_sum(), perform_binarization(), perform_boundary_to_zero(), perform_theta_channel_prominence_of_peaks(), perform_stack_union(), gaussian_kernel1d(), detect_peaks_batch_compatible(), cylindrical_theta_shift(), exponential_spreadbased_list(), reset_theta_list()
  - PipelineCustomFilter: run_decision_tree(), run_protocol_batch_compatible_track_time(), run_protocol_batch_compatible(), minmax_per_channel(), make_dense_filterbank(), make_full_filter_bank(), make_kernel(), perform_conv(), perform_featuremap_reshaping(), perform_sigmoid_activation(), perform_segregation_metric(), perform_theta_based_sum(), perform_binarization(), perform_boundary_to_zero(), perform_theta_channel_prominence_of_peaks(), perform_stack_union(), gaussian_kernel1d(), detect_peaks_batch_compatible(), cylindrical_theta_shift(), exponential_spreadbased_list_depprecated(), exponential_spreadbased_list(), reset_theta_list(), select_theta_list()
  - protocol_decision_tree()
- src.pulsarsa.postprocessing module
  - resize_image_linear()
  - DelayGraph
  - LineClassifier
  - ConnectedComponents: ellipse_major_axis_protocol(), protocol2(), protocol(), filter_if_exceeds_max_components(), skeletonize_image(), detect_nodes_in_skeleton(), detect_right_angles(), divide_branches_in_skeleton(), divide_branches_in_skeleton_right_angles(), label_skeleton(), filter_out_small_components(), plot(), fit_eclipse_to_cc(), extract_cx_cy(), extract_xs_ys(), line_normal_form(), collect_line_params_from_coors(), label_binary_image(), plot_lines_on_cc(), collect_line_params_from_ccs(), cluster_based_on_features(), plot_clusters_from_ccs(), plot_lines_of_cluster(), detect_lines_from_cluster_based_on_snr(), calculate_line_points_from_intercepts(), compute_line_snr()
  - bresenham_line()
  - align_and_pad_signals()
  - compute_line_intensity_profile()
  - find_rowwise_centers_with_box_regularized()
  - snr_from_sum_profile()
  - align_profiles_to_positions_with_scaling()
  - align_profiles_to_positions()
  - compute_line_snr_simple()
  - FitSegmentedTraces: protocol(), extract_coors(), categorize_trace(), fit_linear_curve(), plot(), fitt_to_all_traces(), plot_all_traces(), return_detected_categories(), plot_all_traces_with_categories()
- src.pulsarsa.preprocessing module
- src.pulsarsa.train_neural_network_model module
- src.pulsarsa.tune_parameters module
  - TunerPCA
  - TunableParameterExtractor: pull_nns_from_pipeline(), pull_type(), pull_parameters(), push_parameters(), pull_learnable_params_from_torch_module(), push_learnable_params_to_torch_module(), pull_learnable_params_from_pipeline(), push_learnable_params_to_pipeline()
  - Tuner
- src.pulsarsa.tuner_optimization module
- Module contents
  - Payload
  - PrepareFreqTimeImage
  - ImageReader
  - ImageDataSet
  - LabelReader
  - LabelDataSet
  - ImageToMaskDataset
  - InMaskToMaskDataset
  - ImageInMaskToMaskDataset
  - TrainImageToMaskNetworkModel
  - TrainSignalToLabelModel
  - UNet
  - FilterCNN
  - PipelineImageToCCtoLabels: get_nns_from_pipeline(), image_to_mask_method(), mask_to_labelled_skeleton_method(), labelled_skeleton_to_labels_method(), display_results_in_batch(), test_on_real_data_from_npy_files(), measure_accuracy(), validate_efficiency()
  - PipelineImageToDelGraphtoIsPulsar: get_nns_from_pipeline(), image_to_mask_method(), mask_to_signal_method(), signal_to_label_method(), validate_efficiency(), display_results_in_batch(), test_on_real_data_from_npy_files()
  - PipelineImageToFilterDelGraphtoIsPulsar: skip_filtering(), get_nns_from_pipeline(), image_to_mask_method(), filter_mask_method(), mask_to_signal_method(), signal_to_label_method(), validate_efficiency(), display_results_in_batch(), test_on_real_data_from_npy_files()
  - PipelineImageToFilterToCCtoLabels: plot(), remove_background(), skip_filtering(), get_nns_from_pipeline(), image_to_mask_method(), mask_to_labelled_skeleton_method(), filter_mask_method(), filter_imagemask_method(), analyse_signal_noise_segement_ratio(), labelled_skeleton_to_labels_method(), labelled_skeleton_to_labels_for_one_pulse_method(), display_results_in_batch(), test_on_real_data_from_npy_files(), measure_accuracy(), validate_efficiency()
  - PipelineImageToMask
  - TunerPipelineTunerLossFunction
  - find_the_best_mixed_pipeline()
  - save_best_mixed_model()
  - plot_loss_in_2d_in_pairwise_parameter_combi()
  - make_best_mixed_model_from_params()
Module contents
- class src.Payload(freqs: list[float], bandwidths: list[float] | None = None)
Bases: object
This class is used to store radio packet data.
- plot(type: str = 'img')
Plots the dataframe
- Parameters:
type (str, optional) – plot type. Defaults to ‘img’.
- Returns:
plt.axes – returns the axes of the plot
- add_flux(radio_packet: list[list[float]])
This function adds the flux to the dataframe. The flux is a list of lists, where each list is a row of the dataframe. The first element of the list is the flux and the second element is the frequency. The function checks if the frequencies of the radio packet match the frequencies of the payload. If they do, it adds the flux to the dataframe. If they don’t, it raises an error.
- Parameters:
radio_packet (list[list[float]]) – list of lists, where each list is a row of the dataframe. The first element of the list is the flux and the second element is the frequency.
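The frequency check described for add_flux can be sketched as below. This is a minimal illustration and not the package's implementation: the helper name `check_packet_frequencies`, the relative tolerance, and the choice of `ValueError` are assumptions. Only the row layout (flux first, frequency second) comes from the description above.

```python
import math


def check_packet_frequencies(radio_packet, freqs, rel_tol=1e-9):
    """Hypothetical helper: return the flux values if the packet matches `freqs`.

    Each row of `radio_packet` is [flux, frequency], as described for add_flux.
    """
    if len(radio_packet) != len(freqs):
        raise ValueError(
            f"packet has {len(radio_packet)} rows, payload has {len(freqs)} frequencies"
        )
    fluxes = []
    for (flux, freq), expected in zip(radio_packet, freqs):
        # Compare with a tolerance so float round-off does not cause a mismatch.
        if not math.isclose(freq, expected, rel_tol=rel_tol):
            raise ValueError(f"frequency {freq} does not match payload frequency {expected}")
        fluxes.append(flux)
    return fluxes
```

A packet whose frequencies differ from the payload's is rejected before any flux is stored, which is the behaviour the description asks for.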
- add_description(description: dict)
This function adds a description to the payload. The description is a dictionary with the keys ‘Pulsar’, ‘NBRFI’, ‘BBRFI’. The values of the dictionary are the descriptions of the pulsar, NBRFI and BBRFI respectively.
- Parameters:
description (dict) – dictionary with the keys ‘Pulsar’, ‘NBRFI’, ‘BBRFI’. The values of the dictionary are the number of the pulsar, NBRFI and BBRFI respectively.
- assign_bandwidths_to_freqchannels(bandwidths: list[float])
This function assigns bandwidths to the freqs
- Parameters:
bandwidths (list[float]) – list of bandwidths.
Note
The bandwidths are assigned to the frequencies in the same order as the frequencies. The bandwidths list length must match the length of the frequencies.
- assign_rot_phases(rot_phases: list[float])
This function assigns the rotation phases to the dataframe. The rotation phases are a list of floats.
- Parameters:
rot_phases (list[float]) – list of rotation phases.
- write_payload_to_jsonfile(file_name: str)
This method writes the payload to a json file.
- Parameters:
file_name (str) – path + name of the file to write to
- classmethod read_payload_from_jsonfile(filename: str)
This method reads the payload from a json file.
- Parameters:
filename (str) – path + name of the file to read from
- Returns:
Payload – returns the payload object
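A write/read round trip through the two JSON methods might look like the sketch below. `MiniPayload` is a stand-in written for this example, and its field layout is an assumption: the real Payload may store and serialize its data differently. Only the two method names and their arguments mirror the signatures documented above.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class MiniPayload:
    """Stand-in for Payload; the fields here are assumed, not the package's."""

    freqs: list
    description: dict = field(default_factory=dict)

    def write_payload_to_jsonfile(self, file_name: str) -> None:
        # Serialize the dataclass fields as a plain JSON object.
        with open(file_name, "w", encoding="utf-8") as handle:
            json.dump(asdict(self), handle)

    @classmethod
    def read_payload_from_jsonfile(cls, filename: str) -> "MiniPayload":
        # Rebuild the object from the stored JSON object.
        with open(filename, encoding="utf-8") as handle:
            return cls(**json.load(handle))


def round_trip(payload: MiniPayload) -> MiniPayload:
    """Write the payload to a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "payload.json")
        payload.write_payload_to_jsonfile(path)
        return MiniPayload.read_payload_from_jsonfile(path)
```

Because the reader is a classmethod, it can be called without an existing instance, which matches the documented `classmethod read_payload_from_jsonfile`.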
- class src.PrepareFreqTimeImage(do_rot_phase_avg: bool = True, do_resize: bool = True, do_binarize: bool = False, resize_size: tuple = (256, 256), binarize_engine: ~src.pulsarsa.preprocessing.BinarizeToMask = <src.pulsarsa.preprocessing.BinarizeToMask object>)
Bases: object
Class to implement methods to load and pre-process radio payloads into a freq-time image
- preparation_protocol(payload: Payload)
Protocol to call methods to prepare freq-time graphs
- Parameters:
payload (Payload) – Payload class object made during the simulation
- Returns:
image (np.ndarray) – freq-time image
- plot(payload_address: str)
plot the freq-time graph loaded from payload file
- Parameters:
payload_address (str) – path to the payload file
- class src.ImageReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>, resize_size: tuple = (256, 256), do_average: bool = False, do_binarize: bool = False)
Bases: object
Class ImageReader acts as an engine to load/prepare images from payload files or numpy arrays
- static read_from_payload(filename: str, resize_size: tuple = (128, 128), do_average: bool = False, do_binarize: bool = False)
Method to load freq-time image from payload files
- Parameters:
filename (str) – full address to payload file
resize_size (tuple, optional) – output shape of the loaded image. Defaults to (128, 128).
do_average (bool, optional) – If True then the image is created by averaging phase values over many rotations. Defaults to False.
do_binarize (bool, optional) – If True then the image is binarized. Defaults to False.
- Returns:
(np.ndarray) – loaded image
- class src.ImageDataSet(image_tag: str, image_directory: str, image_reader_engine: ~src.pulsarsa.pipeline_methods.ImageReader | ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = <src.pulsarsa.pipeline_methods.ImageReader object>)
Bases: object
This class is used to represent or memory map a set of images
- plot(idx)
plots the image from the set represented by the idx
- Parameters:
idx (int) – index of the image
- Returns:
plt.axis – axis of the plot
- class src.LabelReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>)
Bases: object
This class acts as an engine to read the labels from files of type payload
- static read_from_payload(filename: str)
Method to read the label from a payload file
- Parameters:
filename (str) – full path to the payload file
- Returns:
dict – dictionary containing details of the payload file
- static read_from_str(label_memomory_map: str)
Method to read the label from numpy file
- static correct_key_names(description: dict)
- class src.LabelDataSet(image_tag: str, image_directory: str, label_reader_engine: ~src.pulsarsa.pipeline_methods.LabelReader = <src.pulsarsa.pipeline_methods.LabelReader object>)
Bases: object
This class is used to represent or memory map a set of labels of the images
- plot(idx)
prints the label of idx image
- Parameters:
idx (int) – index representing the image
- Returns:
dict – description
- class src.ImageToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))
Bases: Dataset
Class to represent Image Mask dataset
- plot(index)
Plot image mask pair represented by index
- Parameters:
index (int) – index of the pair
- class src.InMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))
Bases: Dataset
Class to represent InMask Mask dataset
- plot(index)
Plot InMask and Mask pair
- Parameters:
index (int) – index of the pair to plot
- class src.ImageInMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))
Bases: Dataset
Class to represent Image+InMask Mask dataset
- plot(index)
Plot the 2-channel input (image + in_mask) and the target mask.
- class src.TrainImageToMaskNetworkModel(num_epochs: int, loss_criterion: Module = CustomLossUNet(), store_trained_model_at: str = './syn_data/model/trained_unet_test_v0.pt', model: Module | None = None, train_test_split: float = 0.8, learning_rate: float = 0.001, batch_size: int = 10)
Bases: object
Class involving methods to train Image (or InMask) to Mask Network
- train_model(image_mask_pairset: ImageToMaskDataset | InMaskToMaskDataset | ImageInMaskToMaskDataset, early_stopping_patience: int = 5, plot: bool = False, plot_path: str | None = None, label_dataset: LabelDataSet | None = None, preferred_label: str | None = None, plot_validation_samples_at: str | None = None, ml_flow_folder: str | None = None, ml_flow_exp_name: str | None = None)
Train the network with validation, early stopping, and optional plotting.
- Parameters:
image_mask_pairset (ImageToMaskDataset|InMaskToMaskDataset|ImageInMaskToMaskDataset) – Dataset containing image-mask pairs.
early_stopping_patience (int) – Epochs to wait after no val loss improvement.
plot (bool) – Whether to show a loss curve plot. Default is False.
- Returns:
(list, list, list) – epochs, training losses, validation losses
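The role of early_stopping_patience can be illustrated on a plain list of validation losses. This is a sketch of the usual patience rule, not the code inside train_model: the function name `early_stopping_epoch` is hypothetical, and it assumes that "improvement" means a strictly lower validation loss than the best seen so far.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which training stops.

    Training stops once `patience` consecutive epochs pass without a new
    best validation loss. If that never happens, the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best loss: reset the patience counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)
```

With losses `[1.0, 0.8, 0.9, 0.85, 0.81, 0.7]`, a patience of 2 stops at epoch 4, while a patience of 5 lets training reach the better loss at epoch 6.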
- test_model(image: tensor, plot_pred: bool = False)
Method to test the network
- Parameters:
image (torch.tensor) – image to convert to mask
plot_pred (bool, optional) – If True, plots the prediction from the NN. Defaults to False.
- Returns:
(np.ndarray) – prediction by the network as predicted mask
- class src.TrainSignalToLabelModel(num_epochs: int, loss_criterion: Module = BCELoss(), store_trained_model_at: str = './syn_data/model/trained_OneDconvEncoder_test_v0.pt', model: Module | None = None)
Bases: object
Class involving methods to train nn to classify signal into labels
- train_model(signal_label_pairset: SignalToLabelDataset)
Method to train the NN
- Parameters:
signal_label_pairset (SignalToLabelDataset) – signal label pair dataset
- Returns:
(list,list) – epoch number and the loss in each epoch
- test_model_from_signal(signal: tensor, plot_pred: bool = False)
Method to test the nn to classify a signal
- Parameters:
signal (torch.tensor) – signal to classify
plot_pred (bool, optional) – If True, plots the signal with the predicted probability of each category. Defaults to False.
- Returns:
np.ndarray – predicted labels with their probabilities
- test_model(mask: ndarray, plot_pred: bool = False)
Method to test model from mask
- Parameters:
mask (np.ndarray) – mask from which category probability is predicted after generating the signal
plot_pred (bool, optional) – If True plots the results. Defaults to False.
- Returns:
np.ndarray – predicted labels with their probabilities
- class src.UNet(in_channels=1, out_channels=1, init_features=32)
Bases: Module
This model is taken from the original UNet paper, as available on the PyTorch website. Some modifications were made to the input and output channels and the kernel size.
- Parameters:
nn (_type_) – _description_
- class src.FilterCNN(in_channels=1, out_channels=1, init_features=32)
Bases: Module
This is a simple encoder-decoder architecture for filtering the segmented pulsar signals.
- class src.PipelineImageToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, min_cc_size_threshold: int = 10)
Bases: object
Class implementing the steps, in sequence, to generate a segmented freq-time image, extract connected components (CCs), and assign the CCs to categories
- get_nns_from_pipeline()
This method returns the neural networks used in the pipeline
- Returns:
list – list of neural networks
- image_to_mask_method(image: ndarray)
Method to convert image to mask
- Parameters:
image (np.ndarray) – image
- Returns:
image (np.ndarray) – mask
- mask_to_labelled_skeleton_method(mask: ndarray)
Method to make labelled skeleton from mask
- Parameters:
mask (np.ndarray) – segmented mask
- Returns:
(np.ndarray) – labelled skeleton
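The labelling part of this step, together with the min_cc_size_threshold argument of the constructor, can be sketched in pure Python. This is an illustration under stated assumptions, not the package's method: it uses 4-connectivity, works on nested lists rather than np.ndarray, and omits skeletonization entirely. The name `label_components` is hypothetical.

```python
from collections import deque


def label_components(mask, min_size=10):
    """Label 4-connected foreground regions; regions below `min_size` stay 0."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Only regions that are large enough receive a label.
            if len(region) >= min_size:
                for y, x in region:
                    labels[y][x] = next_label
                next_label += 1
    return labels
```

Dropping small regions before any further analysis keeps isolated noise pixels from being treated as candidate components.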
- labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)
Method to analyze each label of the labelled skeleton
- Parameters:
labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, return detailed results. Defaults to False.
- Returns:
list – results for each label
- display_results_in_batch(image_data_set: ImageDataSet, mask_data_set: ImageDataSet, label_data_set: LabelDataSet, randomize: bool = True, ids_toshow: list = [0, 1], batch_size: int = 2)
Plot results of the pipeline with step outputs and comparison with pre-labelled dataset
- Parameters:
image_data_set (ImageDataSet) – Image dataset
label_data_set (LabelDataSet) – Label dataset of the images
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then choose ids_toshow from dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.
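The interaction between randomize, ids_toshow and batch_size can be sketched as a small index-selection helper. The function `choose_ids`, its `seed` argument, and the silent dropping of out-of-range ids are assumptions made for this example; the documented method may handle these cases differently.

```python
import random


def choose_ids(dataset_size, randomize=True, ids_toshow=(0, 1), batch_size=2, seed=None):
    """Hypothetical helper: pick which dataset indices to display."""
    if randomize:
        # Draw batch_size distinct indices; ids_toshow is ignored in this mode.
        rng = random.Random(seed)
        return rng.sample(range(dataset_size), k=min(batch_size, dataset_size))
    # Fixed mode: use the requested ids, skipping any outside the dataset.
    return [i for i in ids_toshow if 0 <= i < dataset_size]
```

Passing a seed makes the random selection repeatable, which is convenient when comparing plots between runs.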
- test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5)
Method to test pipeline on .npy file dataset
- Parameters:
image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
- measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)
- validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)
Method to validate efficiency of the pipeline from image and label dataset
- Parameters:
image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.
- Returns:
tuple – true_scores,true_negative_scores,false_scores,signal_present,f1_scores, precision, recall, accuracy
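The precision, recall, F1 and accuracy entries in the returned tuple follow the standard definitions, sketched below from raw counts. How the pipeline counts true and false detections is not stated here, so the four count arguments and the name `classification_scores` are assumptions of this example.

```python
def classification_scores(tp, tn, fp, fn):
    """Compute standard scores from true/false positive and negative counts."""
    total = tp + tn + fp + fn
    # Guard every ratio so empty categories give 0.0 instead of an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```

For example, 8 true positives, 5 true negatives, 2 false positives and 5 false negatives give a precision of 0.8 and an accuracy of 0.65.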
- class src.PipelineImageToDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str)
Bases: object
Class implementing methods in sequence to generate segmented freq-time Image then Delay graph then to determine if pulsar is there
- get_nns_from_pipeline()
- image_to_mask_method(image: ndarray)
Method to convert image to mask
- Parameters:
image (np.ndarray) – image
- Returns:
image (np.ndarray) – mask
- mask_to_signal_method(mask)
Method to convert mask to delaygraph and extract lags as signal
- Parameters:
mask (np.ndarray) – mask
- Returns:
np.ndarray – x_lags as signal
- signal_to_label_method(signal: ndarray)
Method to determine if pulsar is there based on signal
- Parameters:
signal (np.ndarray) – x_lags as signal
- Returns:
float – probability if pulsar is there
- validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)
Method to validate efficiency of the pipeline from image and label dataset
- Parameters:
image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
- Returns:
float – efficiency measure
- display_results_in_batch(image_data_set: ImageDataSet, mask_data_set: ImageDataSet, label_data_set: LabelDataSet, randomize: bool = True, ids_toshow: list = [0, 1], batch_size: int = 2)
Plot results of the pipeline with step outputs and comparison with pre-labelled dataset
- Parameters:
image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then choose ids_toshow from dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.
- test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)
Method to test pipeline on .npy file dataset
- Parameters:
image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
- class src.PipelineImageToFilterDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str, skip_filter: bool = False)
Bases: object
Class implementing methods in sequence to generate segmented freq-time Image, filter it, then Delay graph then to determine if pulsar is there
- skip_filtering(skip_filter: bool = False)
Method to skip filter step
- Parameters:
skip_filter (bool, optional) – If True then skip filter step. Defaults to False.
- get_nns_from_pipeline()
- image_to_mask_method(image: ndarray)
Method to convert image to mask
- Parameters:
image (np.ndarray) – image
- Returns:
image (np.ndarray) – mask
- filter_mask_method(pred_binarized: ndarray)
Method to filter out wrong segments in the segmented mask
- Parameters:
pred_binarized (np.ndarray) – segmented mask to filter
- Returns:
np.ndarray – filtered segmented mask
- mask_to_signal_method(mask)
Method to convert mask to delaygraph and extract lags as signal
- Parameters:
mask (np.ndarray) – mask
- Returns:
np.ndarray – x_lags as signal
- signal_to_label_method(signal: ndarray)
Method to determine if pulsar is there based on signal
- Parameters:
signal (np.ndarray) – x_lags as signal
- Returns:
float – probability if pulsar is there
- validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)
Method to validate efficiency of the pipeline from image and label dataset
- Parameters:
image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
- Returns:
float – efficiency measure
- display_results_in_batch(image_data_set: ImageDataSet, mask_data_set: ImageDataSet, label_data_set: LabelDataSet, randomize: bool = True, ids_toshow: list = [0, 1], batch_size: int = 2)
Plot results of the pipeline with step outputs and comparison with pre-labelled dataset
- Parameters:
image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then choose ids_toshow from dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.
- test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)
Method to test pipeline on .npy file dataset
- Parameters:
image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
- class src.PipelineImageToFilterToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, min_cc_size_threshold: int = 10, skip_filter: bool = False, min_axis_ratio: float = 4.0, allow_image_as_input_to_filter: bool = False, remove_backgound: bool = True, one_pulse_mode: bool = True, return_snr: bool = False, box_func_window=5, snr_thresh=2, corr_thresh=5)
Bases: object
Class implementing methods in sequence to generate a segmented frequency-time image, filter it, perform connected components (CC) analysis, and categorize the components.
- The processing pipeline consists of the following steps:
Image-to-mask conversion using a neural network.
Mask filtering using a neural network.
Mask-to-labeled skeleton conversion using connected components.
Labeled skeleton to category/label identification.
- Parameters:
image_to_mask_network (nn.Module) – Neural network to convert image to mask.
trained_image_to_mask_network_path (str) – Path to trained weights of the image-to-mask network.
mask_filter_network (nn.Module) – Neural network to filter mask.
trained_mask_filter_network_path (str) – Path to trained weights of the mask filter network.
min_cc_size_threshold (int) – Minimum size of a connected component to be considered valid.
skip_filter (bool) – If True, skip the filtering step.
min_axis_ratio (float) – Minimum axis ratio of a connected component to be considered valid.
allow_image_as_input_to_filter (bool) – If True, allow the image as input to the filter network.
remove_background (bool) – If True, remove background from the image before processing.
one_pulse_mode (bool) – If True, use one-pulse mode. Clustering and regularization select the best cluster of CCs, which is then identified as a single category.
return_snr (bool) – If True, return SNR in the results.
box_func_window (int) – Window size for boxcar function to align pulses in each channel on the lines from each cluster.
snr_thresh (float) – Threshold for SNR to consider a cluster as a valid pulse.
corr_thresh (float) – Threshold for correlation to identify a valid pulse component in a channel for regularization.
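As an illustration of how min_cc_size_threshold and min_axis_ratio could act on a segmented mask, the following is a minimal sketch in plain Python. It is not the package's implementation: the helper names label_components and filter_components are hypothetical, 8-connectivity is assumed, and the axis ratio is approximated here by the bounding-box long side divided by the short side.

```python
from collections import deque


def label_components(mask):
    """Label 8-connected foreground regions of a binary mask (list of lists)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                current += 1
                labels[r][c] = current
                queue = deque([(r, c)])
                # Breadth-first flood fill over the 8 neighbours.
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = current
                                queue.append((ny, nx))
    return labels, current


def filter_components(mask, min_cc_size_threshold=10, min_axis_ratio=4.0):
    """Return the label image and the labels that pass both thresholds.

    Assumption: axis ratio = bounding-box long side / short side.
    """
    labels, n_labels = label_components(mask)
    kept = []
    for lab in range(1, n_labels + 1):
        pixels = [(r, c) for r, row in enumerate(labels)
                  for c, value in enumerate(row) if value == lab]
        height = max(r for r, _ in pixels) - min(r for r, _ in pixels) + 1
        width = max(c for _, c in pixels) - min(c for _, c in pixels) + 1
        ratio = max(height, width) / min(height, width)
        if len(pixels) >= min_cc_size_threshold and ratio >= min_axis_ratio:
            kept.append(lab)
    return labels, kept


# Toy mask: one long vertical line (kept) and one small square blob (rejected).
mask = [[0] * 12 for _ in range(12)]
for r in range(12):
    mask[r][2] = 1
for r in range(5, 8):
    for c in range(8, 11):
        mask[r][c] = 1

labels, kept = filter_components(mask)
```

In this toy case the line has 12 pixels and a bounding-box ratio of 12, so it passes both thresholds, while the 3x3 blob fails on size and on elongation.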
- __call__(image: np.ndarray, return_steps: bool = False)
Runs the pipeline on the given image.
- plot(image: ndarray, return_steps: bool = False)
Method to plot the results of the pipeline
- Parameters:
image (np.ndarray) – input image
return_steps (bool, optional) – If True, return intermediate steps. Defaults to False.
- Returns:
matplotlib.axes.Axes – Axes object with the plots
- remove_background(image: ndarray) ndarray
Method to remove a gradient-type background from the image using Gaussian filtering
- Parameters:
image (np.ndarray) – input image
- Returns:
np.ndarray – image with background removed
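One common way to remove a slowly varying (gradient-type) background is to subtract a heavily Gaussian-smoothed copy of the image from the image itself. The sketch below shows that idea with NumPy only. It is an assumption about the general technique, not the package's code: the function names and the sigma value are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_2d(image, sigma):
    """Separable Gaussian blur with reflect padding (output has the input shape)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="reflect")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def remove_gradient_background(image, sigma=4.0):
    """Subtract the smoothed image, leaving small-scale structure."""
    return image - smooth_2d(image, sigma)


# Toy image: a linear gradient along the columns plus one bright pixel.
image = np.tile(np.linspace(0.0, 10.0, 64), (64, 1))
image[32, 32] += 5.0

residual = remove_gradient_background(image)
```

Away from the image edges the linear gradient cancels almost exactly, while the bright pixel survives nearly unchanged because the blur spreads it over many neighbours.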
- skip_filtering(skip_filter: bool = True)
Method to skip filtering step in the pipeline
- Parameters:
skip_filter (bool, optional) – If True, then skip filtering step. Defaults to True.
- get_nns_from_pipeline()
This method returns the neural networks used in the pipeline
- Returns:
list – list of neural networks used in the pipeline
- image_to_mask_method(image: ndarray)
Method to convert image to mask
- Parameters:
image (np.ndarray) – image
- Returns:
image (np.ndarray) – mask
- mask_to_labelled_skeleton_method(mask: ndarray)
Method to make labelled skeleton from mask
- Parameters:
mask (np.ndarray) – segmented mask
- Returns:
np.ndarray – labelled skeleton
- filter_mask_method(pred_binarized: ndarray)
Method to filter out wrong segments in the segmented mask
- Parameters:
pred_binarized (np.ndarray) – segmented mask to filter
- Returns:
np.ndarray – filtered segmented mask
- filter_imagemask_method(image_pred_binarized: ndarray) ndarray
Filter out wrong segments in the segmented mask.
- Parameters:
image_pred_binarized (np.ndarray) – input with 2 channels (image + in_mask), shape (H, W, 2)
- Returns:
np.ndarray – filtered and binarized segmented mask, shape (H, W)
- analyse_signal_noise_segement_ratio(binary_filtered_mask: ndarray, thresh: float = 0.3)
Method to analyze the signal-to-noise segment ratio in the filtered mask
- Parameters:
binary_filtered_mask (np.ndarray) – filtered mask
thresh (float, optional) – minimum signal-to-noise segment ratio to consider a segment as valid. Defaults to 0.3.
- Returns:
bool – True if the signal-to-noise segment ratio is less than the threshold, False otherwise
- labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)
Method to analyze each label of the labelled skeleton
- Parameters:
labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, return detailed results. Defaults to False.
- Returns:
list of results, one per label
- labelled_skeleton_to_labels_for_one_pulse_method(labelled_skeleton: ndarray, real_image: ndarray, return_detailed_results: bool = False, return_snr: bool = False, snr_thresh=2, box_func_window=5, signal_length=20, top_n=5, corr_thresh=5)
- display_results_in_batch(image_data_set: ImageDataSet, mask_data_set: ImageDataSet, label_data_set: LabelDataSet, randomize: bool = True, ids_toshow: list = [0, 1], batch_size: int = 2)
Plot results of the pipeline with step outputs and comparison with pre-labelled dataset
- Parameters:
image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the images
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, shows the images with these ids from the dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.
- test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5, save_plot_path: str | None = None)
Method to test pipeline on .npy file dataset
- Parameters:
image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
save_plot_path (str, optional) – If provided, saves the plot in that location. Defaults to None.
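Since image_data_set is typed as a memmap, one way to obtain such an object from a .npy file is np.load with mmap_mode="r". The sketch below writes a small temporary file and maps it back. The file name and the (N, H, W) array layout are assumptions for illustration; the documentation above does not state the expected shape.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name and layout: 8 images of 16 x 16 pixels.
    path = os.path.join(tmp, "images.npy")
    rng = np.random.default_rng(0)
    np.save(path, rng.random((8, 16, 16)).astype(np.float32))

    # mmap_mode="r" maps the file read-only instead of loading it into RAM.
    images = np.load(path, mmap_mode="r")
    is_memmap = isinstance(images, np.memmap)
    shape = images.shape

    # Copy one image into memory before the mapping is released.
    first_image = np.array(images[0])
    del images
```

A memory-mapped array reads data from disk only when it is indexed, which helps when the dataset is larger than available memory.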
- measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)
- validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)
Method to validate efficiency of the pipeline from image and label dataset
- Parameters:
image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.
- Returns:
tuple – true_scores, true_negative_scores, false_scores, signal_present, f1_scores, precision, recall, accuracy
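For readers unfamiliar with the last four entries of the returned tuple, the sketch below computes precision, recall, F1 and accuracy from raw counts using their standard definitions. The function name and the way counts are passed in are hypothetical. How the pipeline accumulates its own counts per label is not documented here.

```python
def classification_scores(tp, fp, fn, tn):
    """Standard precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, f1, accuracy


# Toy counts: 8 true positives, 2 false positives,
# 2 false negatives and 8 true negatives.
precision, recall, f1, accuracy = classification_scores(8, 2, 2, 8)
```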
- class src.PipelineImageToMask(image_to_mask_network: Module, trained_image_to_mask_network_path: str)
Bases: object
Class implementing methods in sequence to generate a segmented frequency-time image from a frequency-time image
- image_to_mask_method(image: ndarray)
Method to convert image to mask
- Parameters:
image (np.ndarray) – image
- Returns:
image (np.ndarray) – mask
- plot(image: ndarray)
Plots the given image
- Parameters:
image (np.ndarray) – image
- Returns:
plt.axis – axis of the plot
- class src.Tuner(sample_of_objects: list[Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar], show_steps: bool = True, reset_components: bool = True, variance_to_capture: float | int = 0.95, all_sliders: bool = False)
Bases: object
This class is used to tune the parameters of a neural network or a pipeline. It uses PCA to reduce the dimensionality of the parameters and then allows the user to modify the PCA components to generate new data points. This feature is in beta mode and subject to upgrade/change.
- Parameters:
sample_of_objects (list) – A list of objects that are either nn.Module or PipelineImageToCCtoLabels or PipelineImageToFilterToCCtoLabels or PipelineImageToFilterDelGraphtoIsPulsar or PipelineImageToDelGraphtoIsPulsar.
show_steps (bool, optional) – If True, it will print the steps of the tuning process. Defaults to True.
reset_components (bool, optional) – If True, it will reset the PCA components to the average scaled PCA factors. Defaults to True.
variance_to_capture (float|int, optional) – The variance to capture in PCA. Defaults to 0.95.
all_sliders (bool, optional) – If True, it will generate sliders for all PCA components. Defaults to False.
When a tuner instance is called with a folder_to_save argument, it will generate a mixed model and save the state_dict of the neural networks in the pipeline to the specified folder, and return the mixed model object.
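The idea of mixing models through PCA can be sketched with NumPy alone: flatten each sample model's parameters into a row, run PCA, keep enough components to reach variance_to_capture, and rebuild a parameter vector from a chosen set of factors. Everything below is a hypothetical stand-in (random vectors instead of real network weights, no factor scaling) and is not the Tuner's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: 6 sample models, each flattened to 50 parameters.
samples = rng.normal(size=(6, 50))
variance_to_capture = 0.95

# PCA through SVD of the mean-centred parameter matrix.
mean = samples.mean(axis=0)
centered = samples - mean
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# Smallest number of components whose cumulative variance reaches the target.
n_components = int(np.searchsorted(np.cumsum(explained), variance_to_capture) + 1)
components = vt[:n_components]

# PCA factors of each sample model, shape (6, n_components).
factors = centered @ components.T


def mix(pca_factors):
    """Rebuild a flat parameter vector from a set of PCA factors."""
    return mean + pca_factors @ components


# Using the average factors returns the average of the sample parameters.
mixed = mix(factors.mean(axis=0))
```

In a real setting the rebuilt vector would be reshaped back into the tensors of a state_dict. That step is omitted because it depends on the network architecture.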
- generate_mixed_model_from_current_components()
Generates a mixed model object from the current PCA components and returns it.
- get_sample_pca_components()
Returns the sample PCA components used for tuning.
- get_current_scaled_pca_factors()
Returns the current scaled PCA factors.
- set_scaled_pca_factors(scaled_pca_factors: ndarray)
Sets the scaled PCA factors to the given values.
- generate_mixed_model_from_input_pca_factors(pca_factors: ndarray)
- get_component_ranges()
- class src.PipelineTunerLossFunction(tuner: Tuner, image_dataset, label_dataset, sample_size, weights, save_model: bool = False, save_path: str = './syn_data/model/')
Bases: object
This class calculates the loss function of the mixed model generated by mixing PCA components. To create an instance, pass the Tuner object, image dataset, label dataset, sample size and weights for the loss function.
- Parameters:
tuner (Tuner) – The Tuner object that contains the PCA engine and other parameters.
image_dataset (np.ndarray) – numpy array of 2d images to be used for testing the mixed model.
label_dataset (np.ndarray) – numpy array of labels corresponding to the images, for example [‘Pulsar+NBRFI’, ‘Pulsar’].
sample_size (int) – Number of samples to randomly select from the dataset for calculating loss.
weights (list) – Weights to be applied to the loss function for each label component.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.
Note
The Tuner object should have been initialized with the PCA engine and the number of components to be used for mixing.
If the sample size is smaller than the number of images in the dataset, samples are selected randomly from the dataset, which adds noise to the loss function. If the sample size is close to the number of images in the dataset, the loss function is more stable, but for large datasets it takes a long time to calculate.
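The trade-off described in this note can be seen in a small simulation: the spread of a loss estimated from a random subset shrinks as the subset grows. The per-image losses below are random numbers invented for the illustration, and the helper names are hypothetical.

```python
import random
import statistics

random.seed(0)

# Hypothetical per-image losses for a dataset of 1000 images.
per_image_loss = [random.random() for _ in range(1000)]


def sampled_loss(sample_size):
    """Mean loss over a random subset drawn without replacement."""
    return statistics.fmean(random.sample(per_image_loss, sample_size))


def spread(sample_size, repeats=200):
    """Standard deviation of the sampled loss over repeated draws."""
    estimates = [sampled_loss(sample_size) for _ in range(repeats)]
    return statistics.pstdev(estimates)


spread_small = spread(10)    # noisy estimate, cheap to compute
spread_large = spread(500)   # stable estimate, more evaluations per call
```

A noisier loss makes the optimizer's job harder, so the sample size is a balance between evaluation cost and stability.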
- catalogue_data_set_containing_pulsar_nbrfi_bbrfi_none()
This function reads the input label set (numpy array) and catalogues the indices containing Pulsar, NBRFI, BBRFI and None.
- create_normalized_ori_labels(idx: list)
- create_normalized_pred_labels(predictor, idx: list)
- src.find_the_best_mixed_pipeline(f: PipelineTunerLossFunction, n_calls=200, n_initial_points=10, minimizer: str = 'gp_minimize', random_seed: int | None = None, niter=1, min_minima: bool = True, save_model: bool = False, save_path: str = './syn_data/model/')
This function finds the best mixed model by optimizing the PCA components using a loss function. It uses the skopt library to minimize the loss function by varying the PCA components. The function returns the best mixed model and the results of the optimization.
- Parameters:
f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
n_calls (int, optional) – Number of calls to the loss function during optimization. Defaults to 200.
n_initial_points (int, optional) – Number of initial points to sample before optimization apart from the x0 points. Defaults to 10.
minimizer (str, optional) – Minimization method to use. Options are ‘gp_minimize’, ‘forest_minimize’. Defaults to ‘gp_minimize’.
random_seed (int, optional) – Random seed for reproducibility. Defaults to None.
niter (int, optional) – Number of iterations to run the optimization. Defaults to 1.
min_minima (bool, optional) – The type of minima to return. If True, returns the minimum minima found. If False, returns the minima closest to the mean of all minima found. Defaults to True.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.
- Returns:
tuple – A tuple containing the best mixed model and a list of results from the optimization.
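The effect of min_minima when niter is greater than 1 can be sketched as a selection over the minima found in each run. The code below is an interpretation, not the package's implementation: it assumes that "closest to the mean of all minima" is measured on the loss values, and the function name and the (loss, params) pair layout are hypothetical.

```python
def pick_minimum(results, min_minima=True):
    """Choose one result from a list of (loss, params) pairs, one per run.

    Assumption: distance to the mean is measured on the loss values.
    """
    if min_minima:
        # Lowest loss across all runs.
        return min(results, key=lambda pair: pair[0])
    mean_loss = sum(loss for loss, _ in results) / len(results)
    # Result whose loss lies closest to the mean loss of all runs.
    return min(results, key=lambda pair: abs(pair[0] - mean_loss))


# Toy results from three optimisation runs (invented numbers).
runs = [(0.30, [0.1, -0.4]), (0.10, [0.3, 0.2]), (0.22, [0.0, 0.5])]

lowest = pick_minimum(runs, min_minima=True)
typical = pick_minimum(runs, min_minima=False)
```

Picking the result nearest the mean trades the best observed loss for one that is less likely to be an outlier of a noisy loss function.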
- src.save_best_mixed_model(best_mixed_model, save_path: str = './syn_data/model/')
Saves the best mixed model’s dynamic parameters (specifically the neural nets) to a specified path.
- Parameters:
best_mixed_model (callable) – The best mixed model to be saved.
save_path (str) – The directory where the model should be saved.
- src.plot_loss_in_2d_in_pairwise_parameter_combi(res, figsize_per_subplot=4, save_path=None)
Plot all pairwise 2D partial dependence plots of parameters in res on a grid. Adds a colorbar and axis labels for each subplot and shows the figure.
- Parameters:
res – skopt result object (must have res.space.dimensions).
figsize_per_subplot – size per subplot in inches (default 4).
save_path – path to save the figure (default None, which means it won’t be saved).
- Returns:
matplotlib.figure.Figure object
- src.make_best_mixed_model_from_params(f: PipelineTunerLossFunction, best_pca_combi: list, save_model: bool = False, save_path: str = './syn_data/model/')
Creates the best mixed model from the given PCA components and saves it if required.
- Parameters:
f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
best_pca_combi (list) – The best PCA components to be used for creating the mixed model.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.
- Returns:
callable – The best mixed model created from the given PCA components.