src.pulsarsa package

Subpackages

src.pulsarsa.tools package

Submodules

src.pulsarsa.information_packet_formats module

class src.pulsarsa.information_packet_formats.Payload(freqs: list[float], bandwidths: list[float] | None = None)

Bases: object

This class is used to store radio packet data.

plot(type: str = 'img')

Plots the dataframe

Parameters:: type (str, optional) – kind of plot to draw. Defaults to ‘img’.
Returns:: plt.axes – returns the axes of the plot

add_flux(radio_packet: list[list[float]])

This function adds the flux to the dataframe. The flux is a list of lists, where each list is a row of the dataframe. The first element of the list is the flux and the second element is the frequency. The function checks if the frequencies of the radio packet match the frequencies of the payload. If they do, it adds the flux to the dataframe. If they don’t, it raises an error.

Parameters:: radio_packet (list[list[float]]) – list of lists, where each list is a row of the dataframe. The first element of the list is the flux and the second element is the frequency.
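
The frequency check described for add_flux can be sketched as follows. This is a minimal illustration and not the package's implementation: the helper name append_flux_row, the plain-list storage in place of the dataframe, and the tolerance-based comparison are all assumptions.

```python
import math


def append_flux_row(freqs, flux_rows, radio_packet):
    """Append one row of flux values after checking the packet frequencies.

    Hypothetical helper: each packet entry is [flux, frequency], as in the
    add_flux description. Plain lists stand in for the payload's dataframe.
    """
    packet_freqs = [entry[1] for entry in radio_packet]
    same_length = len(packet_freqs) == len(freqs)
    if not same_length or not all(
        math.isclose(a, b) for a, b in zip(packet_freqs, freqs)
    ):
        raise ValueError("radio packet frequencies do not match the payload")
    # Frequencies agree, so keep only the flux values as a new row.
    flux_rows.append([entry[0] for entry in radio_packet])
    return flux_rows


rows = append_flux_row([1400.0, 1410.0], [], [[0.2, 1400.0], [0.5, 1410.0]])
```

A packet whose frequencies differ from the payload's raises ValueError here; the exception type used by the real method is not documented.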

add_description(description: dict)

This function adds a description to the payload. The description is a dictionary with the keys ‘Pulsar’, ‘NBRFI’, ‘BBRFI’. The values of the dictionary are the descriptions of the pulsar, NBRFI and BBRFI respectively.

Parameters:: description (dict) – dictionary with the keys ‘Pulsar’, ‘NBRFI’, ‘BBRFI’. The values of the dictionary are the number of the pulsar, NBRFI and BBRFI respectively.

assign_bandwidths_to_freqchannels(bandwidths: list[float])

This function assigns bandwidths to the freqs

Parameters:: bandwidths (list[float]) – list of bandwidths.

Note

The bandwidths are assigned to the frequencies in the same order as the frequencies. The bandwidths list length must match the length of the frequencies.

assign_rot_phases(rot_phases: list[float])

This function assigns the rotation phases to the dataframe. The rotation phases are given as a list of floats.

Parameters:: rot_phases (list[float]) – list of rotation phases.

write_payload_to_jsonfile(file_name: str)

This method writes the payload to a json file.

Parameters:: file_name (str) – path + name of the file to write to

classmethod read_payload_from_jsonfile(filename: str)

This method reads the payload from a json file.

Parameters:: filename (str) – path + name of the file to read from
Returns:: Payload – returns the payload object
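
A JSON round trip of the kind these two methods perform might look like the sketch below. The record layout (keys freqs, bandwidths, description) and the helper names are assumptions made for illustration; the actual on-disk schema of a payload file is not documented here.

```python
import json
import os
import tempfile


def write_payload(record, file_name):
    """Write a payload-like dictionary to a JSON file (hypothetical helper)."""
    with open(file_name, "w", encoding="utf-8") as handle:
        json.dump(record, handle)


def read_payload(file_name):
    """Read the dictionary back from the JSON file (hypothetical helper)."""
    with open(file_name, "r", encoding="utf-8") as handle:
        return json.load(handle)


# Assumed layout, loosely following the constructor and add_description.
record = {
    "freqs": [1400.0, 1410.0],
    "bandwidths": [10.0, 10.0],
    "description": {"Pulsar": 1, "NBRFI": 0, "BBRFI": 0},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "payload.json")
    write_payload(record, path)
    restored = read_payload(path)
```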

src.pulsarsa.neural_network_models module

class src.pulsarsa.neural_network_models.UNet(in_channels=1, out_channels=1, init_features=32)

Bases: Module

This model follows the original U-Net paper, as made available on the PyTorch website, with some modifications to the input and output channels and to the kernel size.


class src.pulsarsa.neural_network_models.FilterCNN(in_channels=1, out_channels=1, init_features=32)

Bases: Module

This is a simple encoder-decoder architecture for filtering the segmented pulsar signals.

class src.pulsarsa.neural_network_models.WeightedBCELoss(pos_weight=1.0, neg_weight=1.0, eps=1e-07)

Bases: Module

forward(inputs, targets)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
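
The forward computation of WeightedBCELoss is not described here. Judging only from the constructor arguments (pos_weight, neg_weight, eps), a class-weighted binary cross-entropy of the following form is one plausible reading. The formula and the function name weighted_bce are assumptions, and plain Python floats stand in for tensors.

```python
import math


def weighted_bce(inputs, targets, pos_weight=1.0, neg_weight=1.0, eps=1e-07):
    """Mean weighted binary cross-entropy over predicted probabilities.

    Assumed form: positives are scaled by pos_weight, negatives by
    neg_weight, and eps keeps the logarithm away from log(0).
    """
    total = 0.0
    for p, t in zip(inputs, targets):
        total += -(
            pos_weight * t * math.log(p + eps)
            + neg_weight * (1.0 - t) * math.log(1.0 - p + eps)
        )
    return total / len(inputs)


loss_balanced = weighted_bce([0.9, 0.1], [1.0, 0.0])
loss_weighted = weighted_bce([0.5, 0.5], [1.0, 0.0], pos_weight=2.0)
```

Raising pos_weight increases the penalty for missed positives, which is a common choice when the signal pixels are much rarer than the background.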

src.pulsarsa.pipeline_methods module

src.pulsarsa.pipeline_methods.resize_image_linear(img, new_shape)

Resize 2D image to new_shape using linear interpolation.

Parameters:

img – np.ndarray Input 2D array (image)
new_shape – tuple Desired shape (a, b)

Returns:

np.ndarray: Resized image of shape new_shape
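
Linear (bilinear) resizing of a 2D image can be sketched as below. Nested lists stand in for np.ndarray, the name resize_linear_sketch is hypothetical, and the corner-aligned sampling grid is an assumption; the package's function may map coordinates differently.

```python
def resize_linear_sketch(img, new_shape):
    """Resize a 2D list of numbers to new_shape with bilinear interpolation."""
    rows, cols = len(img), len(img[0])
    new_rows, new_cols = new_shape

    def source_coord(index, n_new, n_old):
        # Corner-aligned mapping: first and last samples stay in place.
        if n_new == 1:
            return 0.0
        return index * (n_old - 1) / (n_new - 1)

    out = []
    for r in range(new_rows):
        y = source_coord(r, new_rows, rows)
        y0 = int(y)
        y1 = min(y0 + 1, rows - 1)
        fy = y - y0
        row = []
        for c in range(new_cols):
            x = source_coord(c, new_cols, cols)
            x0 = int(x)
            x1 = min(x0 + 1, cols - 1)
            fx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


resized = resize_linear_sketch([[0.0, 2.0], [4.0, 6.0]], (3, 3))
```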

class src.pulsarsa.pipeline_methods.ImageReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>, resize_size: tuple = (256, 256), do_average: bool = False, do_binarize: bool = False)

Bases: object

Class ImageReader acts as an engine to load/prepare images from payload files or numpy arrays

static read_from_payload(filename: str, resize_size: tuple = (128, 128), do_average: bool = False, do_binarize: bool = False)

Method to load freq-time image from payload files

Parameters:

filename (str) – full address to payload file
resize_size (tuple, optional) – output shape of the loaded image. Defaults to (128, 128).
do_average (bool, optional) – If True then the image is created by averaging phase values over many rotations. Defaults to False.
do_binarize (bool, optional) – If True then the image is binarized. Defaults to False.

Returns:

(np.ndarray) – loaded image

class src.pulsarsa.pipeline_methods.LabelReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>)

Bases: object

This class acts as an engine to read the labels from files of type payload

static read_from_payload(filename: str)

Method to read the label from payload file

Parameters:: filename (str) – full path to the payload file
Returns:: dict – dictionary containing details of the payload file

static read_from_str(label_memomory_map: str)

Method to read the label from numpy file

static correct_key_names(description: dict)

class src.pulsarsa.pipeline_methods.ImageDataSet(image_tag: str, image_directory: str, image_reader_engine: ~src.pulsarsa.pipeline_methods.ImageReader | ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = <src.pulsarsa.pipeline_methods.ImageReader object>)

Bases: object

This class is used to represent or memory map a set of images

plot(idx)

plots the image from the set represented by the idx

Parameters:: idx (int) – index of the image
Returns:: plt.axis – axis of the plot

class src.pulsarsa.pipeline_methods.LabelDataSet(image_tag: str, image_directory: str, label_reader_engine: ~src.pulsarsa.pipeline_methods.LabelReader = <src.pulsarsa.pipeline_methods.LabelReader object>)

Bases: object

This class is used to represent or memory map a set of labels of the images

plot(idx)

prints the label of idx image

Parameters:: idx (int) – index representing the image
Returns:: dict – description

class src.pulsarsa.pipeline_methods.PipelineImageToMask(image_to_mask_network: Module, trained_image_to_mask_network_path: str)

Bases: object

Class implementing methods in sequence to generate segmented freq-time Image from freq-time Image

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

plot(image: ndarray)

plots the result for the given image

Parameters:: image (np.ndarray) – image
Returns:: plt.axis – axis of the plot

class src.pulsarsa.pipeline_methods.PipelineImageToDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str)

Bases: object

Class implementing, in sequence, the methods that generate a segmented freq-time image, build the delay graph from it, and determine whether a pulsar is present

get_nns_from_pipeline()

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_signal_method(mask)

Method to convert mask to delaygraph and extract lags as signal

Parameters:: mask (np.ndarray) – mask
Returns:: np.ndarray – x_lags as signal

signal_to_label_method(signal: ndarray)

Method to determine whether a pulsar is present, based on the signal

Parameters:: signal (np.ndarray) – x_lags as signal
Returns:: float – probability that a pulsar is present

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset

Returns:

float – efficiency measure

display_results_in_batch(image_data_set: ImageDataSet, mask_data_set: ImageDataSet, label_data_set: LabelDataSet, randomize: bool = True, ids_toshow: list = [0, 1], batch_size: int = 2)

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then shows the images with the ids in ids_toshow. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

class src.pulsarsa.pipeline_methods.PipelineImageToFilterDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str, skip_filter: bool = False)

Bases: object

Class implementing, in sequence, the methods that generate a segmented freq-time image, filter it, build the delay graph from it, and determine whether a pulsar is present

skip_filtering(skip_filter: bool = False)

Method to skip filter step

Parameters:: skip_filter (bool, optional) – If True then skip filter step. Defaults to False.

get_nns_from_pipeline()

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

filter_mask_method(pred_binarized: ndarray)

Method to filter out wrong segments in the segmented mask

Parameters:: pred_binarized (np.ndarray) – segmented mask to filter
Returns:: np.ndarray – filtered segmented mask

mask_to_signal_method(mask)

Method to convert mask to delaygraph and extract lags as signal

Parameters:: mask (np.ndarray) – mask
Returns:: np.ndarray – x_lags as signal

signal_to_label_method(signal: ndarray)

Method to determine whether a pulsar is present, based on the signal

Parameters:: signal (np.ndarray) – x_lags as signal
Returns:: float – probability that a pulsar is present

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset

Returns:

float – efficiency measure

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then shows the images with the ids in ids_toshow. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

class src.pulsarsa.pipeline_methods.PipelineImageToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, min_cc_size_threshold: int = 10)

Bases: object

Class implementing, in sequence, the methods that generate a segmented freq-time image, extract its connected components (CCs), and assign the CCs to categories

get_nns_from_pipeline()

This method returns the neural networks used in the pipeline

Returns:: list – list of neural networks

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_labelled_skeleton_method(mask: ndarray)

Method to make labelled skeleton from mask

Parameters:: mask (np.ndarray) – segmented mask
Returns:: (np.ndarray) – labelled skeleton

labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)

Method to analyze each label of the labelled skeleton

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, return detailed results. Defaults to False.

Returns:

list – results for each label

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
label_data_set (LabelDataSet) – Label dataset of the images
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then shows the images with the ids in ids_toshow. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.

Returns:

tuple – true_scores,true_negative_scores,false_scores,signal_present,f1_scores, precision, recall, accuracy
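
The last four entries of the returned tuple are standard classification metrics. How the pipeline derives its per-image scores is not documented here, so the sketch below only shows the textbook definitions on boolean predictions; the function name and the dictionary return type are illustrative assumptions.

```python
def classification_scores(predicted, actual):
    """Compute precision, recall, F1 and accuracy from boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


scores = classification_scores(
    [True, True, False, False],
    [True, False, True, False],
)
```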

class src.pulsarsa.pipeline_methods.PipelineImageToFilterToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, min_cc_size_threshold: int = 10, skip_filter: bool = False, min_axis_ratio: float = 4.0, allow_image_as_input_to_filter: bool = False, remove_backgound: bool = True, one_pulse_mode: bool = True, return_snr: bool = False, box_func_window=5, snr_thresh=2, corr_thresh=5)

Bases: object

Class implementing methods in sequence to generate a segmented frequency-time image, filter it, perform connected components (CC) analysis, and categorize the components.

The processing pipeline consists of the following steps:

Image-to-mask conversion using a neural network.
Mask filtering using a neural network.
Mask-to-labeled skeleton conversion using connected components.
Labeled skeleton to category/label identification.
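
The four steps above can be sketched as a simple chain of callables with an optional filter stage. Everything in this example is a stand-in: the toy stage functions replace the neural networks and the connected components analysis, and run_pipeline_sketch is a hypothetical name, not the class's __call__.

```python
def run_pipeline_sketch(image, image_to_mask, filter_mask, mask_to_skeleton,
                        skeleton_to_labels, skip_filter=False):
    """Chain the four stages; the filter stage is bypassed when skip_filter is True."""
    mask = image_to_mask(image)
    if not skip_filter:
        mask = filter_mask(mask)
    labelled_skeleton = mask_to_skeleton(mask)
    return skeleton_to_labels(labelled_skeleton)


# Toy stand-ins for the networks and the CC analysis (all hypothetical).
def toy_image_to_mask(image):
    return [[1 if value > 0.5 else 0 for value in row] for row in image]


def toy_filter_mask(mask):
    # Keep only rows that contain at least two set pixels.
    return [row if sum(row) >= 2 else [0] * len(row) for row in mask]


def toy_mask_to_skeleton(mask):
    # Label every set pixel with its 1-based row number.
    return [[(r + 1) * value for value in row] for r, row in enumerate(mask)]


def toy_skeleton_to_labels(labelled_skeleton):
    return sorted({label for row in labelled_skeleton for label in row if label})


image = [[0.9, 0.8, 0.1], [0.1, 0.7, 0.2]]
stages = (toy_image_to_mask, toy_filter_mask, toy_mask_to_skeleton,
          toy_skeleton_to_labels)
filtered_labels = run_pipeline_sketch(image, *stages)
unfiltered_labels = run_pipeline_sketch(image, *stages, skip_filter=True)
```

With the filter enabled, the isolated pixel in the second row is dropped before labelling; with skip_filter=True it survives as a second label.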

Parameters:

image_to_mask_network (nn.Module) – Neural network to convert image to mask.
trained_image_to_mask_network_path (str) – Path to trained weights of the image-to-mask network.
mask_filter_network (nn.Module) – Neural network to filter mask.
trained_mask_filter_network_path (str) – Path to trained weights of the mask filter network.
min_cc_size_threshold (int) – Minimum size of a connected component to be considered valid.
skip_filter (bool) – If True, skip the filtering step.
min_axis_ratio (float) – Minimum axis ratio of a connected component to be considered valid.
allow_image_as_input_to_filter (bool) – If True, allow the image as input to the filter network.
remove_background (bool) – If True, remove background from the image before processing.
one_pulse_mode (bool) – If True, use one-pulse mode. Clustering and regularization select the best cluster of CCs, which is then identified as a single category.
return_snr (bool) – If True, return SNR in the results.
box_func_window (int) – Window size for boxcar function to align pulses in each channel on the lines from each cluster.
snr_thresh (float) – Threshold for SNR to consider a cluster as a valid pulse.
corr_thresh (float) – Threshold for correlation to identify a valid pulse component in a channel for regularization.

__call__(image: np.ndarray, return_steps: bool = False)

Runs the pipeline on the given image.

plot(image: ndarray, return_steps: bool = False)

Method to plot the results of the pipeline

Parameters:

image (np.ndarray) – input image
return_steps (bool, optional) – If True, return intermediate steps. Defaults to False.

Returns:: matplotlib.axes.Axes – Axes object with the plots

remove_background(image: ndarray) → ndarray

Method to remove a gradient-type background from the image using Gaussian filtering

Parameters:: image (np.ndarray) – input image
Returns:: np.ndarray – image with background removed

skip_filtering(skip_filter: bool = True)

Method to skip filtering step in the pipeline

Parameters:: skip_filter (bool, optional) – If True, then skip filtering step. Defaults to True.

get_nns_from_pipeline()

This method returns the neural networks used in the pipeline

Returns:: list – list of neural networks used in the pipeline

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_labelled_skeleton_method(mask: ndarray)

Method to make labelled skeleton from mask

Parameters:: mask (np.ndarray) – segmented mask
Returns:: (np.ndarray) – labelled skeleton

filter_mask_method(pred_binarized: ndarray)

Method to filter out wrong segments in the segmented mask

Parameters:: pred_binarized (np.ndarray) – segmented mask to filter
Returns:: np.ndarray – filtered segmented mask

filter_imagemask_method(image_pred_binarized: ndarray) → ndarray

Filter out wrong segments in the segmented mask.

Parameters:: image_pred_binarized (np.ndarray) – input with 2 channels (image + in_mask), shape (H, W, 2)
Returns:: np.ndarray – filtered and binarized segmented mask, shape (H, W)

analyse_signal_noise_segement_ratio(binary_filtered_mask: ndarray, thresh: float = 0.3)

Method to analyze the signal to noise segment ratio in the filtered mask

Parameters:

binary_filtered_mask (np.ndarray) – filtered mask
thresh (float, optional) – minimum signal to noise segment ratio to consider a segment as valid. Defaults to 0.3.

Returns:: bool – True if the signal to noise segment ratio is less than the threshold, False otherwise

labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)

Method to analyze each label of the labelled skeleton

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, return detailed results. Defaults to False.

Returns:

list – results for each label

labelled_skeleton_to_labels_for_one_pulse_method(labelled_skeleton: ndarray, real_image: ndarray, return_detailed_results: bool = False, return_snr: bool = False, snr_thresh=2, box_func_window=5, signal_length=20, top_n=5, corr_thresh=5)

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, then shows the images with the ids in ids_toshow. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5, save_plot_path: str | None = None)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
save_plot_path (str, optional) – If provided saves the plot in that location. Defaults to None

measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.

Returns:

tuple – true_scores,true_negative_scores,false_scores,signal_present,f1_scores, precision, recall, accuracy

class src.pulsarsa.pipeline_methods.PipelineCustomFilter0(filter_size_params: list = [10, 5, 20, 'exp', 10], theta_list_params: list = [35, 20, 80, 'exp', 10], noise_amplitude: float = 0.06, feature_thickness: float = 1.0, binarization_thresh: float = 0.1, prominence_thresh: float = 30, prominence_smoothing_kernel_size: int = 3)

Bases: object

run_protocol_batch_compatible_track_time(image_data: Tensor)

run_protocol_batch_compatible(image_data: Tensor)

static minmax_per_channel(x, eps=1e-08)

static make_full_filter_bank(size_list, theta_list, thickness: int | list = 1.0, noise_amplitude=0.06)

static make_kernel(size: int, theta: float, thickness: float = 1.0, noise_amplitude: float = 0.06)

perform_conv(image_data)

perform_featuremap_reshaping(feature_maps)

perform_sigmoid_activation(feature_maps, stretching_factor: float = 255.0)

perform_segregation_metric(probs: Tensor, metric_type: str = 'sin')

perform_theta_based_sum(multiplied_maps)

perform_binarization(theta_maps)

perform_boundary_to_zero(binary_maps: Tensor, thickness: int = 1)

perform_theta_channel_prominence_of_peaks(binary_maps: Tensor)

perform_stack_union(binary_maps: Tensor)

static gaussian_kernel1d(kernel_size, sigma, device)

static detect_peaks_batch_compatible(signal: Tensor, min_prominence: float = 30.0)

static cylindrical_theta_shift(images: Tensor, theta_list)

static exponential_spreadbased_list(mean_theta, clamp: list, num_elements, squeeze_factor: float = 80)

reset_theta_list(theta_list)

class src.pulsarsa.pipeline_methods.PipelineCustomFilter(filter_size: list = array([11, 13, 15, ..., 55, 57, 59]), theta_list: list = array([20, 21, 22, ..., 77, 78, 79]), noise_amplitude: float = 0.06, feature_thickness: float = 1.0, binarization_thresh: float = 0.1, prominence_thresh: float = 30, prominence_smoothing_kernel_size: int = 3)

Bases: object

run_decision_tree(image_data: Tensor, theta_range: list, step_messages: bool = False)

run_protocol_batch_compatible_track_time(image_data: Tensor)

run_protocol_batch_compatible(image_data: Tensor)

static minmax_per_channel(x, eps=1e-08)

static make_dense_filterbank(size_list, theta_list, thickness, noise_amplitude)

static make_full_filter_bank(size_list, theta_list, thickness: int | list = 1.0, noise_amplitude=0.06)

static make_kernel(size: int, theta: float, thickness: float = 1.0, noise_amplitude: float = 0.06)

perform_conv(image_data)

perform_featuremap_reshaping(feature_maps)

perform_sigmoid_activation(feature_maps, stretching_factor: float = 255.0)

perform_segregation_metric(probs: Tensor, metric_type: str = 'sin')

perform_theta_based_sum(multiplied_maps)

perform_binarization(theta_maps)

perform_boundary_to_zero(binary_maps: Tensor, thickness: int = 1)

perform_theta_channel_prominence_of_peaks(binary_maps: Tensor)

perform_stack_union(binary_maps: Tensor)

static gaussian_kernel1d(kernel_size, sigma, device)

static detect_peaks_batch_compatible(signal: Tensor, min_prominence: float = 30.0)

static cylindrical_theta_shift(images: Tensor, theta_list)

static exponential_spreadbased_list_depprecated(mean_theta, clamp: list, num_elements, squeeze_factor: float = 80)

static exponential_spreadbased_list(mean_theta, clamp, num_elements=40, squeeze_factor=4)

reset_theta_list(theta_list)

select_theta_list(mask, theta_list)

src.pulsarsa.pipeline_methods.protocol_decision_tree(pipeline_obj: PipelineCustomFilter, image_data, theta_range)

src.pulsarsa.postprocessing module

src.pulsarsa.postprocessing.resize_image_linear(img, new_shape)

Resize 2D image to new_shape using linear interpolation.

Parameters:

img – np.ndarray Input 2D array (image)
new_shape – tuple Desired shape (a, b)

Returns:

np.ndarray: Resized image of shape new_shape

class src.pulsarsa.postprocessing.DelayGraph(normalize_delays: bool = True, do_smoothing: bool = True)

Bases: object

Class with methods to generate a delay graph from a freq-time image. The delay graph is generated by calculating the lag between the mean signal and each of the freq channel signals

plot(dispersed_freq_time: ndarray)

plot the delay graph generated from dispersed freq-time image

Parameters:: dispersed_freq_time (np.ndarray) – dispersed pulsar freq-time image

static protocol(dispersed_freq_time: ndarray)

Method to generate the delay graph signal from the dispersed freq-time image

Parameters:: dispersed_freq_time (np.ndarray) – dispersed pulsar freq-time image
Returns:: (1D np.ndarray,1D np.ndarray) – delaygraph signal as lags and freq-channel index
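
The lag computation described for DelayGraph can be sketched with a brute-force cross-correlation against the mean signal. This is an illustration under assumptions: the function names are hypothetical, nested lists replace np.ndarray, the sign convention for the lag is chosen here, and the normalization and smoothing controlled by normalize_delays and do_smoothing are left out.

```python
def best_lag(reference, signal):
    """Return the shift of signal relative to reference that maximises their correlation."""
    n = len(reference)
    best, best_score = 0, float("-inf")
    for lag in range(-(n - 1), n):
        # Correlate reference[i] with signal[i + lag] over the overlapping part.
        score = sum(
            reference[i] * signal[i + lag]
            for i in range(n)
            if 0 <= i + lag < n
        )
        if score > best_score:
            best, best_score = lag, score
    return best


def delay_graph_sketch(freq_time):
    """Lag of every frequency channel against the channel-averaged signal."""
    n_channels = len(freq_time)
    n_samples = len(freq_time[0])
    mean_signal = [
        sum(channel[t] for channel in freq_time) / n_channels
        for t in range(n_samples)
    ]
    lags = [best_lag(mean_signal, channel) for channel in freq_time]
    return lags, list(range(n_channels))


# Three channels with the same pulse arriving two samples later each time.
freq_time = [
    [0, 1, 3, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 3, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 3, 1, 0],
]
lags, channels = delay_graph_sketch(freq_time)
```

In this toy input the lags grow linearly with the channel index, which is the kind of trend a line fit on the delay graph would then pick up.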

class src.pulsarsa.postprocessing.LineClassifier(no_pulsar_slope_range: tuple | None = None)

Bases: object

Class to classify delaygraphs into pulse or not pulse after fitting a line to delaygraph

protocol(y_channels_normalized: ndarray)

method to classify delaygraphs by fitting line

Parameters:

x_lags_normalized (np.ndarray) – lags from delay graph
y_channels_normalized (np.ndarray) – freq channel index

Returns:

(float,func,list) – slope of the fitted line, function handle used for fitting, fitted params

static normalize_coors(x_coors: ndarray, y_coors: ndarray)

Normalize lags from 0 to 1

Parameters:

x_coors (np.ndarray) – lags from delay graph
y_coors (np.ndarray) – freq channel index

Returns:

(np.ndarray,np.ndarray) – normalized lags and freq index
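
A min-max rescaling to the range 0 to 1 is the usual way to perform this kind of normalization. The sketch below assumes that both coordinate arrays are rescaled independently and that a constant array maps to zeros; neither detail is confirmed by the description above, and the function names are hypothetical.

```python
def normalize_to_unit_range(values):
    """Rescale a sequence of numbers linearly so that it spans 0 to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # Assumed fallback for constant input, to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


def normalize_coors_sketch(x_coors, y_coors):
    """Normalize lags and channel indices independently (assumption)."""
    return normalize_to_unit_range(x_coors), normalize_to_unit_range(y_coors)


x_norm, y_norm = normalize_coors_sketch([-2, 0, 2], [0, 1, 2])
```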

static correct_for_nans(x_coors: ndarray, y_coors: ndarray)

Replace nans with 0.5

Parameters:

x_coors (np.ndarray) – lags from delay graph
y_coors (np.ndarray) – freq channel index

Returns:

(np.ndarray,np.ndarray) – corrected lags and freq index

static smooth_curve(x_coors: ndarray, y_coors: ndarray, window_size: int = 6, polynomial_order: int = 4)

smooth lags

Parameters:

x_coors (np.ndarray) – lags from delay graph
y_coors (np.ndarray) – freq channel index
window_size (int, optional) – smoothing savgol filter size. Defaults to 6.
polynomial_order (int, optional) – polynomial order of the savgol filter. Defaults to 4.

Returns:

(np.ndarray,np.ndarray) – smoothed lags and freq index

static fit_line_curve(x_coors: ndarray, y_coors: ndarray)

Fit line curve

Parameters:

x_coors (np.ndarray) – lags from delay graph
y_coors (np.ndarray) – freq channel index

Returns:

(float,func,list) – slope of the fitted line, function handle used for fitting, fitted params
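
A straight-line fit returning the same kind of triple (slope, function handle, fitted parameters) can be sketched with the closed-form least-squares solution. The fitting routine the package actually uses is not stated here, so the name fit_line_sketch and the model y = a * x + b are assumptions.

```python
def fit_line_sketch(x_coors, y_coors):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(x_coors)
    mean_x = sum(x_coors) / n
    mean_y = sum(y_coors) / n
    sxx = sum((x - mean_x) ** 2 for x in x_coors)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_coors, y_coors))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def line(x, a, b):
        # Model function, returned so the caller can evaluate the fit.
        return a * x + b

    return slope, line, [slope, intercept]


slope, line, params = fit_line_sketch([0.0, 0.5, 1.0], [1.0, 2.0, 3.0])
fitted_value = line(2.0, *params)
```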

plot(x_lags_normalized: ndarray, y_channels_normalized: ndarray)

plot the fitted line with the data

Parameters:

x_lags_normalized (np.ndarray) – normalized lags from delay graph
y_channels_normalized (np.ndarray) – freq channel index

class src.pulsarsa.postprocessing.ConnectedComponents(dilate: bool = True, small_component_size: int = 10, max_num_comp: int = 30, min_axis_ratio: float = 4.0, only_one_pulse_mode: bool = True)

Bases: object

Class to calculate connected components (CC) from segmented freq-time image to label signal types

ellipse_major_axis_protocol(dispersed_freq_time_segmented: ndarray, rm_verti_hori: bool = True) → ndarray

Replace elongated connected components in a binary image with a line along the major axis of their fitted ellipse, using skimage only.

Parameters:

dispersed_freq_time_segmented (np.ndarray) – Binary input image (2D).
rm_verti_hori (bool) – If True, remove vertical and horizontal lines.

Returns:

np.ndarray – Binary image with lines replacing elongated components.

protocol2(dispersed_freq_time_segmented: ndarray)

Method to perform CC analysis

Parameters:: dispersed_freq_time_segmented (np.ndarray) – binary mask of dispersed freq-time image
Returns:: np.ndarray – labelled skeleton image each label representing a connected component

protocol(dispersed_freq_time_segmented: ndarray)

Method to perform CC analysis

Parameters:: dispersed_freq_time_segmented (np.ndarray) – binary mask of dispersed freq-time image
Returns:: np.ndarray – labelled skeleton image each label representing a connected component

filter_if_exceeds_max_components(filtered_labelled_skeleton: ndarray)

static skeletonize_image(dispersed_freq_time_segmented: ndarray)

Skeletonize components of a binary image

Parameters:: dispersed_freq_time_segmented (np.ndarray) – binary mask of dispersed freq-time image
Returns:: (np.ndarray) – skeletonized image

static detect_nodes_in_skeleton(skeleton_img: ndarray, dilate: bool = True)

Method to detect nodes (dots) in regions where line elements of the skeletonized image criss-cross

Parameters:

skeleton_img (np.ndarray) – skeletonized binary dispersed freq-time image
dilate (bool, optional) – option to dilate the connected components after putting the dots. Defaults to True.

Returns:

(np.ndarray) – boolean numpy 2d array with value True in place of dots
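
One common way to find crossing points in a skeleton is to count the 8-connected skeleton neighbours of every skeleton pixel and flag those with three or more. The sketch below uses that rule as an assumption; the package's criterion may differ, and the optional dilation step is omitted.

```python
def detect_nodes_sketch(skeleton):
    """Flag skeleton pixels that have at least three skeleton neighbours."""
    n_rows, n_cols = len(skeleton), len(skeleton[0])
    nodes = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if not skeleton[r][c]:
                continue
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and skeleton[rr][cc]:
                        neighbours += 1
            nodes[r][c] = neighbours >= 3
    return nodes


# Two diagonal lines crossing at the centre pixel.
skeleton = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
]
nodes = detect_nodes_sketch(skeleton)
```

For axis-aligned crossings this simple rule can also flag the pixels adjacent to the junction, which is one reason a real implementation may refine or dilate the result.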

static detect_right_angles(skeleton_img: ndarray, dilate: bool = True)

Method to detect right angles in the skeletonized image

Parameters:: skeleton_img (np.ndarray) – skeletonized binary dispersed freq-time image
Returns:: (np.ndarray) – boolean numpy 2d array with value True in place of right angles

static divide_branches_in_skeleton(skeleton_img: ndarray, node_pixels: ndarray)

Cut lines in skeleton where they criss cross

Parameters:

skeleton_img (np.ndarray) – skeletonized binary dispersed freq-time image
node_pixels (np.ndarray) – boolean numpy 2d array with value True in place of dots

Returns:

(np.ndarray) – skeletonized image with the lines or curves without any criss cross

static divide_branches_in_skeleton_right_angles(skeleton_img: ndarray, right_angle_pixels: ndarray)

Cut lines in skeleton where they form right angles

Parameters:

skeleton_img (np.ndarray) – skeletonized binary dispersed freq-time image
right_angle_pixels (np.ndarray) – boolean numpy 2d array with value True in place of right angles

Returns:

(np.ndarray) – skeletonized image with the lines or curves without any right angles

static label_skeleton(skeleton_img: ndarray)

Label the skeleton so that each label ID represents one connected component

Parameters:: skeleton_img (np.ndarray) – skeletonized binary dispersed freq-time image
Returns:: np.ndarray – labelled skeleton image, each label representing a connected component

static filter_out_small_components(labelled_skeleton_img: ndarray, small_component_size: int = 10)

Filter CC based on number of pixels

Parameters:

labelled_skeleton_img (np.ndarray) – labelled image each label representing a connected component
small_component_size (int, optional) – minimum CC size threshold. Defaults to 10.

Returns:

np.ndarray – filtered labelled image
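
A size filter of this kind can be sketched with np.bincount. The helper name is hypothetical, and treating the threshold as inclusive (components with at least small_component_size pixels are kept) is an assumption.

```python
import numpy as np

def filter_small(labelled: np.ndarray, small_component_size: int = 10) -> np.ndarray:
    # counts[k] is the number of pixels carrying label k
    counts = np.bincount(labelled.ravel())
    keep = counts >= small_component_size
    keep[0] = False  # label 0 is treated as background
    # Zero out every pixel whose label is not kept
    return np.where(keep[labelled], labelled, 0)

labelled = np.zeros((4, 12), dtype=int)
labelled[0, :] = 1    # component with 12 pixels
labelled[2, :3] = 2   # component with 3 pixels
filtered = filter_small(labelled, small_component_size=10)
```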

plot(dispersed_freq_time_segmented: ndarray)

Plots the results of the CC analysis

Parameters:: dispersed_freq_time_segmented (np.ndarray) – binary mask of dispersed freq-time image

static fit_eclipse_to_cc(coords)

static extract_cx_cy(coor, angle_with_x)

Extract the x-intercept and y-intercept of the line passing through (x0, y0) at given angle with x-axis.

Parameters:

coor (tuple) – (x0, y0) point on line
angle_with_x (float) – angle of line w.r.t. x-axis (radians)

Returns:

tuple – (x_intercept, y_intercept)
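
The intercept computation follows from the parametric form of the line, (x0 + t cos a, y0 + t sin a). A minimal sketch, assuming a hypothetical helper name and assuming that a missing intercept (line parallel to an axis) is reported as infinity:

```python
import math

def intercepts(coor, angle_with_x):
    x0, y0 = coor
    s, c = math.sin(angle_with_x), math.cos(angle_with_x)
    # x-intercept: set y = 0, so t = -y0 / s
    x_int = x0 - y0 * c / s if abs(s) > 1e-12 else math.inf
    # y-intercept: set x = 0, so t = -x0 / c
    y_int = y0 - x0 * s / c if abs(c) > 1e-12 else math.inf
    return x_int, y_int

# Line through (0, 3) with slope -1 (angle 135 degrees)
x_int, y_int = intercepts((0.0, 3.0), 3 * math.pi / 4)
```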

static extract_xs_ys(coor, angle_with_x, image_shape)

static line_normal_form(coor, angle_with_x)

Compute perpendicular distance of the line from origin and the angle of the perpendicular (normal) with x-axis.

Parameters:

coor (tuple) – (x0, y0) a point on the line
angle_with_x (float) – angle of line with x-axis (radians)

Returns:

tuple – (distance_from_origin, normal_angle)
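
The normal (Hesse) form describes a line as x cos(phi) + y sin(phi) = rho. A minimal sketch with a hypothetical helper name; the sign convention (rho kept non-negative, phi wrapped into [0, 2*pi)) is an assumption and may differ from the package.

```python
import math

def normal_form(coor, angle_with_x):
    x0, y0 = coor
    # The normal is perpendicular to the line direction
    normal_angle = angle_with_x + math.pi / 2
    rho = x0 * math.cos(normal_angle) + y0 * math.sin(normal_angle)
    if rho < 0:
        # Flip the normal so the distance is non-negative
        rho = -rho
        normal_angle += math.pi
    return rho, normal_angle % (2 * math.pi)

# Line x + y = 2, passing through (2, 0) at 135 degrees to the x-axis
rho, phi = normal_form((2.0, 0.0), 3 * math.pi / 4)
```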

static collect_line_params_from_coors(coors)

static label_binary_image(binary_image)

static plot_lines_on_cc(binary_image, cc_label)

static collect_line_params_from_ccs(binary_image): Extract line parameters from all connected components in a binary image and return as a single DataFrame with a label column.

static cluster_based_on_features(df: DataFrame, features: list[str] = ['x_intercept', 'y_intercept'], eps: float = 0.05, min_samples: int = 1, angle_tolerance: float = 5.0, plot: bool = False)

Cluster the connected components based on selected features.

Parameters:

df (pandas.DataFrame) – DataFrame containing feature columns.
features (list[str]) – List of columns to use for clustering.
eps (float) – DBSCAN eps parameter.
min_samples (int) – DBSCAN min_samples parameter.
angle_tolerance (float) – Tolerance in degrees for filtering horizontal/vertical lines.
plot (bool, optional) – Whether to plot the clusters in feature space. Defaults to False.

Returns:

tuple – labels (numpy.ndarray): Cluster labels for each row in df_filtered (re-ranked). df_filtered (pandas.DataFrame): Filtered DataFrame used for clustering.

static plot_clusters_from_ccs(binary_image, features: list[str] = ['x_intercept', 'y_intercept'], eps: float = 0.05)

static plot_lines_of_cluster(binary_image, real_image=None, features: list[str] = ['x_intercept', 'y_intercept'], eps: float = 0.05, cluster_id: int | None = None, cmap='viridis', box_width=5, bg_offset=5, top_n=5): Cluster connected components based on line parameters and plot the lines of a chosen cluster in real image space, colored by SNR. Also plots the sum of the sum_signals of the top N lines with highest SNR.

static detect_lines_from_cluster_based_on_snr(binary_image, real_image, features: list[str] = ['x_intercept', 'y_intercept'], eps: float = 0.05, cluster_ids: list[int] = [1, 2, 3], cmap='viridis', box_func_window=5, signal_length=20, top_n=5, snr_thresh=10, corr_thresh=5, plot: bool = True)

static calculate_line_points_from_intercepts(x_int, y_int, image_shape)

static compute_line_snr(image: ndarray, xs, ys, box_func_window=3, signal_length=20, corr_thresh=0.5, direction: str = 'horizontal')

src.pulsarsa.postprocessing.bresenham_line(y0, x0, y1, x1)
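
No description is given for this function. For orientation, the classic integer Bresenham line rasterization can be sketched as below. The helper name and the return format (a list of (y, x) tuples, following the argument order of the signature) are assumptions.

```python
def bresenham(y0, x0, y1, x1):
    points = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction along x
    sy = 1 if y0 < y1 else -1  # step direction along y
    err = dx - dy
    x, y = x0, y0
    while True:
        points.append((y, x))
        if x == x1 and y == y1:
            break
        e2 = 2 * err
        if e2 > -dy:  # error favours a step along x
            err -= dy
            x += sx
        if e2 < dx:   # error favours a step along y
            err += dx
            y += sy
    return points

line = bresenham(0, 0, 2, 4)
```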

src.pulsarsa.postprocessing.align_and_pad_signals(signals)

Align 1D signals to their center and pad with random values.

Parameters:: signals (list of 1D numpy arrays)
Returns:: aligned (2D numpy array (n_signals, max_len))
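
Centre alignment with random padding can be sketched as follows. The helper name is hypothetical, and drawing the padding from a normal distribution with each signal's own mean and standard deviation is an assumption; the source only states that the padding is random.

```python
import numpy as np

def align_and_pad(signals, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    max_len = max(len(s) for s in signals)
    aligned = np.empty((len(signals), max_len), dtype=float)
    for i, s in enumerate(signals):
        s = np.asarray(s, dtype=float)
        # Assumed padding model: noise matching the signal's mean and spread
        aligned[i] = rng.normal(s.mean(), s.std(), size=max_len)
        # Place the signal so that its centre sits at the row centre
        start = (max_len - len(s)) // 2
        aligned[i, start:start + len(s)] = s
    return aligned

signals = [np.array([1.0, 2.0, 3.0]), np.arange(7.0)]
aligned = align_and_pad(signals, rng=np.random.default_rng(0))
```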

src.pulsarsa.postprocessing.compute_line_intensity_profile(image: ndarray, xs, ys, box_width=5, direction: str = 'perpendicular'): Compute line intensity profiles along a line with option to sample perpendicular to line or horizontally (parallel to x-axis).

src.pulsarsa.postprocessing.find_rowwise_centers_with_box_regularized(profiles: ndarray, box_width: int, subband_width: int, corr_thresh: float = 0.3) → ndarray

Divide rows into subbands, sum rows in each subband, find the column position of maximum correlation with a box function, and regularize centers across subbands to avoid artificial lines. Also limit jumps: if new center > prev ± 2*box_width, use prev_center.

Parameters:

profiles (2D np.ndarray (n_rows, n_cols)) – Intensity profiles.
box_width (int) – Width of the box function.
subband_width (int) – Number of rows to group into one subband.
corr_thresh (float) – Minimum correlation (normalized) to accept box peak.

Returns:

positions (1D np.ndarray of shape (n_rows,)) – Column centers for each row (same within subband).
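
The subband procedure described above can be sketched with np.correlate. This is a simplified illustration under stated assumptions: the helper name is hypothetical, the correlation is normalized by the norms of the mean-subtracted subband and the box, and the first subband always keeps its own peak.

```python
import numpy as np

def rowwise_centers(profiles, box_width, subband_width, corr_thresh=0.3):
    n_rows, _ = profiles.shape
    box = np.ones(box_width)
    positions = np.zeros(n_rows, dtype=int)
    prev_center = None
    for start in range(0, n_rows, subband_width):
        # Sum the rows of this subband into one 1D profile
        band = profiles[start:start + subband_width].sum(axis=0)
        band = band - band.mean()
        corr = np.correlate(band, box, mode="same")
        norm = np.linalg.norm(band) * np.linalg.norm(box)
        score = corr.max() / norm if norm > 0 else 0.0
        center = int(corr.argmax())
        if prev_center is not None:
            too_weak = score < corr_thresh
            too_far = abs(center - prev_center) > 2 * box_width
            if too_weak or too_far:
                center = prev_center  # regularize: reuse the previous center
        positions[start:start + subband_width] = center
        prev_center = center
    return positions

profiles = np.zeros((6, 32))
profiles[:3, 9:12] = 1.0   # first subband: pulse centered on column 10
profiles[3:, 12:15] = 1.0  # second subband: pulse centered on column 13
positions = rowwise_centers(profiles, box_width=3, subband_width=3)
```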

src.pulsarsa.postprocessing.snr_from_sum_profile(sum_profile: ndarray, box_width: int)

Calculate SNR from a 1D summed profile using a sliding box.

Parameters:

sum_profile (1D np.ndarray) – Summed signal along the line/subband.
box_width (int) – Width of the box to consider as the signal region.

Returns:

snr (float) – Estimated SNR.
signal_mean (float) – Mean of the signal region.
noise_mean (float) – Mean of the noise region.
noise_std (float) – Standard deviation of the noise region.
signal_center (int) – Column index of the signal peak (center of box).
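
A sliding-box SNR estimate can be sketched as below. The helper name is hypothetical and the SNR definition used here, (signal_mean - noise_mean) / noise_std with everything outside the best box treated as noise, is an assumption; the package may normalize differently.

```python
import numpy as np

def snr_from_profile(sum_profile, box_width):
    sum_profile = np.asarray(sum_profile, dtype=float)
    # window_sums[i] is the sum over columns i .. i + box_width - 1
    window_sums = np.convolve(sum_profile, np.ones(box_width), mode="valid")
    start = int(window_sums.argmax())
    in_box = np.zeros(sum_profile.size, dtype=bool)
    in_box[start:start + box_width] = True
    signal_mean = sum_profile[in_box].mean()
    noise = sum_profile[~in_box]
    noise_mean, noise_std = noise.mean(), noise.std()
    snr = (signal_mean - noise_mean) / noise_std if noise_std > 0 else np.inf
    signal_center = start + box_width // 2
    return snr, signal_mean, noise_mean, noise_std, signal_center

profile = np.tile([0.0, 1.0], 10)  # alternating background
profile[8:11] += 10.0              # pulse three columns wide
snr, signal_mean, noise_mean, noise_std, signal_center = snr_from_profile(profile, box_width=3)
```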

src.pulsarsa.postprocessing.align_profiles_to_positions_with_scaling(profiles: ndarray, positions: ndarray, scaling_factors=2)

Align and scale each row of a 2D profile matrix to specified positions and scaling factors.

Parameters:

profiles (2D np.ndarray (n_rows, n_cols)) – Original intensity profiles.
positions (1D np.ndarray of shape (n_rows,)) – Column index to align each row to (center).
scaling_factors (1D np.ndarray of shape (n_rows,)) – Scaling factor for each row (e.g. 1.2 = zoom in, 0.8 = zoom out).

Returns:

aligned_profiles (2D np.ndarray (n_rows, n_cols)) – Aligned and scaled intensity profiles.

src.pulsarsa.postprocessing.align_profiles_to_positions(profiles: ndarray, positions: ndarray)

Align each row of a 2D profile matrix to the specified positions.

Parameters:

profiles (2D np.ndarray (n_rows, n_cols)) – Original intensity profiles.
positions (1D np.ndarray of shape (n_rows,)) – Column index to align each row to (center).

Returns:

aligned_profiles (2D np.ndarray (n_rows, n_cols)) – Aligned intensity profiles.
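
Row-wise alignment can be sketched with np.roll. The helper name is hypothetical, and two assumptions are made: positions holds the current column of the feature in each row, and rows are shifted circularly so that this column lands on the centre column.

```python
import numpy as np

def align_rows(profiles, positions):
    n_rows, n_cols = profiles.shape
    centre = n_cols // 2
    aligned = np.empty_like(profiles)
    for i in range(n_rows):
        # Assumed behaviour: circular shift, values wrap around the edges
        aligned[i] = np.roll(profiles[i], centre - int(positions[i]))
    return aligned

profiles = np.zeros((2, 8))
profiles[0, 1] = 1.0
profiles[1, 6] = 1.0
aligned = align_rows(profiles, np.array([1, 6]))
```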

src.pulsarsa.postprocessing.compute_line_snr_simple(image, xs, ys, signal_length, box_func_window=5, direction='horizontal', subband_width=10, corr_thresh=6)

class src.pulsarsa.postprocessing.FitSegmentedTraces(labelled_skeleton: ndarray)

Bases: object

Class to assign a category to connected components in a skeleton image by fitting a line

protocol(label_id: int)

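Fit a line to the connected component with the given label and categorize it

Parameters:: label_id (int) – label value representing a connected component in the skeleton image
Returns:: (np.ndarray, np.ndarray, list, function, str, int) – coordinates of the label skeleton, fitted parameters, fitted function, detected category, number of points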

static extract_coors(labelled_skeleton: ndarray, label_id: int)

Extract the coordinates of a label

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton image
label_id (int) – label value representing a connected component in the skeleton image

Returns:

(np.ndarray,np.ndarray) – returns coordinates of the label skeleton

static categorize_trace(m: float)

Method to categorize based on slope of the fitted line

Parameters:: m (float) – slope
Returns:: str – category

static fit_linear_curve(x_coors_sorted, y_coors_sorted)

Method to fit a straight line to the coordinates of the skeleton label

Parameters:

x_coors_sorted (_type_) – x coordinates of the label skeleton
y_coors_sorted (_type_) – y coordinates of the label skeleton

Returns:

(list,function) – fitted parameter,fitted function
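
A straight-line fit that returns both the parameters and a callable can be sketched with np.polyfit and np.poly1d. Whether the package uses these functions is not stated, so this is an illustration only; the helper name is hypothetical.

```python
import numpy as np

def fit_line(x_coors_sorted, y_coors_sorted):
    # Degree-1 least-squares fit: params = [slope, intercept]
    params = np.polyfit(x_coors_sorted, y_coors_sorted, deg=1)
    return list(params), np.poly1d(params)

x = np.arange(5.0)
y = 2.0 * x + 1.0
params, fitted = fit_line(x, y)
```

The slope params[0] is the quantity that a slope-based categorization such as categorize_trace(m) would consume.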

plot(label_id: int)

plot results specific to a label

Parameters:: label_id (int) – label value representing a cc in the skeleton image

classmethod fitt_to_all_traces(labelled_skeleton: ndarray)

Method to categorize all labels

Parameters:: labelled_skeleton (np.ndarray) – labelled skeleton
Returns:: list – results specific to each label in a list

classmethod plot_all_traces(labelled_skeleton: ndarray)

Plot results from all labels

Parameters:: labelled_skeleton (np.ndarray) – labelled skeleton

classmethod return_detected_categories(labelled_skeleton: ndarray, return_detailed_results: bool = False)

Method that summarizes the results into a dict of categories, where the value for each key is the mean length of the traces in that category

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – if True, returns the results for each label as a list. Defaults to False.

Returns:

list – results for each label

classmethod plot_all_traces_with_categories(labelled_skeleton: ndarray, image: ndarray | None = None)

Plots the detailed results for each label

Parameters:

labelled_skeleton (np.ndarray) – labelled_skeleton
image (np.ndarray, optional) – image to superimpose the results. Defaults to None.

src.pulsarsa.preprocessing module

src.pulsarsa.preprocessing.resize_image_linear(img, new_shape)

Resize 2D image to new_shape using linear interpolation.

Parameters:

img – np.ndarray Input 2D array (image)
new_shape – tuple Desired shape (a, b)

Returns:

np.ndarray: Resized image of shape new_shape
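
Separable linear interpolation can be sketched with np.interp, first along columns and then along rows. The helper name is hypothetical and the corner-aligned sampling grid (first and last pixels map onto each other) is an assumption.

```python
import numpy as np

def resize_linear(img, new_shape):
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    new_rows, new_cols = new_shape
    # Sample positions in the coordinates of the original pixel grid
    row_pos = np.linspace(0, rows - 1, new_rows)
    col_pos = np.linspace(0, cols - 1, new_cols)
    # Interpolate every row along the column axis: shape (rows, new_cols)
    tmp = np.array([np.interp(col_pos, np.arange(cols), r) for r in img])
    # Interpolate every column along the row axis: shape (new_rows, new_cols)
    out = np.array([np.interp(row_pos, np.arange(rows), c) for c in tmp.T]).T
    return out

small = np.array([[0.0, 2.0], [4.0, 6.0]])
resized = resize_linear(small, (3, 3))
```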

class src.pulsarsa.preprocessing.BinarizeToMask(binarize_func: str = 'thresh')

Bases: object

Class to define methods to binarize images

plot(image: ndarray)

Method to plot the binarized image

Parameters:: image (np.ndarray) – 2D image with values ranging from 0 to 1
Returns:: image (np.ndarray) – binarized mask

binarize_protocol(image: ndarray)

Calls the binarization method whose name matches the binarize_func value given at initialization (checked against self.binarizing_func in an if condition)

Parameters:: image (np.ndarray) – image to binarize
Returns:: image (np.ndarray) – binarized image

static thresh_method(image: ndarray) → ndarray

static thresh_ni_method(image: ndarray, window_size: int = 15)

static gaussian_blurr(image: ndarray, sigma: int = 3)

Gaussian blur method, selected by passing binarize_func=“gaussian_blur” at initialization

Parameters:

image (np.ndarray) – image to binarize
sigma (int, optional) – sigma of the Gaussian kernel. Defaults to 3.

Returns:

image (np.ndarray) – binarized image

static exponential_method(image: ndarray)

Exponential method, selected through the binarize_func argument at initialization

Parameters:: image (np.ndarray) – image to binarize
Returns:: image (np.ndarray) – binarized image

class src.pulsarsa.preprocessing.PrepareFreqTimeImage(do_rot_phase_avg: bool = True, do_resize: bool = True, do_binarize: bool = False, resize_size: tuple = (256, 256), binarize_engine: ~src.pulsarsa.preprocessing.BinarizeToMask = <src.pulsarsa.preprocessing.BinarizeToMask object>)

Bases: object

Class to implement methods to load and preprocess radio payloads into freq-time images

preparation_protocol(payload: Payload)

Protocol to call methods to prepare freq-time graphs

Parameters:: payload (Payload) – Payload class object made during the simulation
Returns:: image (np.ndarray) – freq-time image

plot(payload_address: str)

plot the freq-time graph loaded from payload file

Parameters:: payload_address (str) – path to the payload file

average_payload_rotphase(payload: Payload)

Method to average a payload spanning multiple rotations onto a single 0-360 rotation phase

Parameters:: payload (Payload) – Payload class object made during the simulation
Returns:: image (np.ndarray) – freq-time image
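
Folding several rotations onto one 0-360 degree phase axis can be sketched as follows. The data layout (a flux array of shape (n_freq, n_time) plus one rotation phase in degrees per time sample), the helper name, and the n_bins parameter are all assumptions made for illustration; the actual Payload layout may differ.

```python
import numpy as np

def fold_and_average(flux, rot_phases, n_bins=360):
    flux = np.asarray(flux, dtype=float)
    # Wrap phases from later rotations back into [0, 360)
    folded = np.mod(np.asarray(rot_phases, dtype=float), 360.0)
    bins = np.floor(folded / 360.0 * n_bins).astype(int)
    image = np.zeros((flux.shape[0], n_bins))
    for b in np.unique(bins):
        # Average all time samples that fall into the same phase bin
        image[:, b] = flux[:, bins == b].mean(axis=1)
    return image  # phase bins without samples stay at zero

flux = np.array([[1.0, 2.0, 3.0, 4.0],
                 [10.0, 20.0, 30.0, 40.0]])
phases = np.array([0.0, 180.0, 360.0, 540.0])  # two full rotations
image = fold_and_average(flux, phases, n_bins=2)
```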

src.pulsarsa.train_neural_network_model module

class src.pulsarsa.train_neural_network_model.ImageMaskPair(image: Tensor | None = None, mask: Tensor | None = None, descriptions: tuple[dict] = ({}, {}))

Bases: object

Class to pair Image and Mask

update_descriptions(descriptions: tuple[dict])

Method to add descriptions about image and mask

Parameters:: descriptions (tuple[dict]) – description

plot()

Plot Image and its mask

Returns:: ndarray – current axis of the plot

classmethod load_from_payload_address(image_payload_address: str, mask_payload_address: str, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>})

Method to load from payload files

Parameters:

image_payload_address (str) – full address to the image payload file
mask_payload_address (str) – full address to the mask payload file
image_engine (PrepareFreqTimeImage, optional) – engine to load image from payload file. Defaults to PrepareFreqTimeImage(do_rot_phase_avg=True,do_binarize=False,do_resize=True).
mask_engine (PrepareFreqTimeImage, optional) – engine to load mask from payload file. Defaults to PrepareFreqTimeImage(do_rot_phase_avg=True,do_binarize=True,do_resize=True).

Returns:

(ImageMaskPair) – ImageMaskPair object with loaded image and mask

classmethod load_from_payload_and_make_in_mask(image_payload_address: str, mask_payload_address: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>})

Method to load image from payload and its mask. The loaded image is then converted to mask using a mask generator to be used as input mask (in_mask). This in_mask and mask pair is used for training filter networks

Parameters:

image_payload_address (str) – full address to the image payload file
mask_payload_address (str) – full address to the mask payload file
mask_maker_engine (PipelineImageToMask) – engine to make mask from image
image_engine (PrepareFreqTimeImage, optional) – engine to load image from payload file. Defaults to PrepareFreqTimeImage(do_rot_phase_avg=True,do_binarize=False,do_resize=True).
mask_engine (PrepareFreqTimeImage, optional) – engine to load mask from payload file. Defaults to PrepareFreqTimeImage(do_rot_phase_avg=True,do_binarize=True,do_resize=True).

Returns:

(ImageMaskPair) – ImageMaskPair object with loaded image and mask

class src.pulsarsa.train_neural_network_model.ImageToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent Image Mask dataset

plot(index)

Plot image mask pair represented by index

Parameters:: index (int) – index of the pair

class src.pulsarsa.train_neural_network_model.InMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent InMask Mask dataset

plot(index)

Plot InMask and Mask pair

Parameters:: index (int) – index of the pair to plot

class src.pulsarsa.train_neural_network_model.ImageInMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent Image+InMask Mask dataset

plot(index): Plot the 2-channel input (image + in_mask) and the target mask.

class src.pulsarsa.train_neural_network_model.TrainImageToMaskNetworkModel(num_epochs: int, loss_criterion: Module = CustomLossUNet(), store_trained_model_at: str = './syn_data/model/trained_unet_test_v0.pt', model: Module | None = None, train_test_split: float = 0.8, learning_rate: float = 0.001, batch_size: int = 10)

Bases: object

Class involving methods to train Image (or InMask) to Mask Network

train_model(image_mask_pairset: ImageToMaskDataset | InMaskToMaskDataset | ImageInMaskToMaskDataset, early_stopping_patience: int = 5, plot: bool = False, plot_path: str | None = None, label_dataset: LabelDataSet | None = None, preferred_label: str | None = None, plot_validation_samples_at: str | None = None, ml_flow_folder: str | None = None, ml_flow_exp_name: str | None = None)

Train the network with validation, early stopping, and optional plotting.

Parameters:

image_mask_pairset (ImageToMaskDataset|InMaskToMaskDataset|ImageInMaskToMaskDataset) – Dataset containing image-mask pairs.
early_stopping_patience (int) – Epochs to wait after no val loss improvement.
plot (bool) – Whether to show a loss curve plot. Default is False.

Returns:

(list, list, list) – epochs, training losses, validation losses
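
The role of early_stopping_patience can be illustrated without any training framework. The sketch below replays a precomputed list of validation losses, which is an assumption made to keep the example self-contained; the helper name is hypothetical and the real method computes the losses during training.

```python
def run_with_early_stopping(val_losses, patience=5):
    best = float("inf")
    stale = 0  # epochs since the last improvement
    epochs = []
    for epoch, loss in enumerate(val_losses, start=1):
        epochs.append(epoch)
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` consecutive epochs
    return epochs, best

epochs, best = run_with_early_stopping([1.0, 0.8, 0.9, 0.85, 0.95, 0.7], patience=2)
```

With a patience of 2, training stops after epoch 4 and never reaches the lower loss at epoch 6, which is the trade-off a small patience value makes.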

test_model(image: tensor, plot_pred: bool = False)

Method to test the network

Parameters:

image (torch.tensor) – image to convert to mask
plot_pred (bool, optional) – If True, plots the prediction from the NN. Defaults to False.

Returns:

(np.ndarray) – prediction by the network as predicted mask

class src.pulsarsa.train_neural_network_model.SignalLabelPair(signal: Tensor | None = None, label: dict | None = None)

Bases: object

Class to represent Signal and label pair

plot()

Method to plot Signal and label

Returns:: np.ndarray – current axis of the plot

classmethod load_from_payload_address(mask_payload_address: str, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>})

Method to load mask from payload and convert it to signal

Parameters:

mask_payload_address (str) – full path to the payload file of the mask
mask_engine (PrepareFreqTimeImage, optional) – engine to make mask from mask payload file. Defaults to PrepareFreqTimeImage( do_rot_phase_avg=True, do_binarize=True, do_resize=True ).

Returns:

SignalLabelPair – Signal Label Pair

class src.pulsarsa.train_neural_network_model.SignalToLabelDataset(mask_tag: str, mask_directory: str, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent Signal Label pair dataset

plot(index)

Method to plot Signal and label pair from the dataset

Parameters:: index (int) – index of the pair
Returns:: np.ndarray – current axes of the plot

class src.pulsarsa.train_neural_network_model.TrainSignalToLabelModel(num_epochs: int, loss_criterion: Module = BCELoss(), store_trained_model_at: str = './syn_data/model/trained_OneDconvEncoder_test_v0.pt', model: Module | None = None)

Bases: object

Class involving methods to train nn to classify signal into labels

train_model(signal_label_pairset: SignalToLabelDataset)

Method to train the NN

Parameters:: signal_label_pairset (SignalToLabelDataset) – signal label pair dataset
Returns:: (list,list) – epoch number and the loss in each epoch

test_model_from_signal(signal: tensor, plot_pred: bool = False)

Method to test the nn to classify a signal

Parameters:

signal (torch.tensor) – signal to classify
plot_pred (bool, optional) – If True, plots the signal with the predicted probability of each category. Defaults to False.

Returns:

np.ndarray – predicted labels with their probabilities

test_model(mask: ndarray, plot_pred: bool = False)

Method to test model from mask

Parameters:

mask (np.ndarray) – mask from which category probability is predicted after generating the signal
plot_pred (bool, optional) – If True plots the results. Defaults to False.

Returns:

np.ndarray – predicted labels with their probabilities

src.pulsarsa.tune_parameters module

class src.pulsarsa.tune_parameters.TunerPCA(data_lists: ndarray, variance_threshold: float | None = None, normalize_features: bool = True, normalize_method: str = 'standard')

Bases: object

TunerPCA class is used to perform PCA on the input data and then modify the PCA components to generate new data points. The class is initialized with the input data and the variance threshold for PCA. The data_lists are numpy 2D arrays with the rows representing the samples and the columns representing the features.

perform_pca(variance_threshold: float | None = None)

modify_pca_components(scaled_pca_factors: ndarray)

perform_inverse_pca()

plot_variance_distribution()
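
The PCA workflow described for this class (normalize, decompose, modify the factors, invert) can be sketched with NumPy's SVD. This is not the class's implementation: the variable names, the random input data, and the 0.95 threshold are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 6))  # rows are samples, columns are features

# "standard" normalization: zero mean and unit variance per feature
mean, std = data.mean(axis=0), data.std(axis=0)
scaled = (data - mean) / std

# PCA via singular value decomposition of the scaled data
u, s, vt = np.linalg.svd(scaled, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Keep the smallest number of components reaching the variance threshold
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
components = vt[:k]
factors = scaled @ components.T  # shape (n_samples, k)

# Modify the factors: here, shift the sample mean by half a standard deviation
new_factors = factors.mean(axis=0) + 0.5 * factors.std(axis=0)

# Inverse PCA: back to the original feature space
new_point = (new_factors @ components) * std + mean
```

Keeping all components makes the inverse transform exact; truncating to k components trades reconstruction accuracy for fewer tunable factors.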

class src.pulsarsa.tune_parameters.TunableParameterExtractor

Bases: object

static pull_nns_from_pipeline(pipeline: Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar)

static pull_type(extract_from: Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar)

static pull_parameters(extract_from: Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar)

static push_parameters(extracted_from, tunable_parameters_flattened, parameter_lengths)

static pull_learnable_params_from_torch_module(neural_network: Module)

static push_learnable_params_to_torch_module(neural_network: Module, learnable_params: ndarray)

static pull_learnable_params_from_pipeline(pipeline: PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar)

static push_learnable_params_to_pipeline(pipeline: PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar, learnable_params: ndarray, param_lengths: list)

class src.pulsarsa.tune_parameters.Tuner(sample_of_objects: list[Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar], show_steps: bool = True, reset_components: bool = True, variance_to_capture: float | int = 0.95, all_sliders: bool = False)

Bases: object

This class is used to tune the parameters of a neural network or a pipeline. It uses PCA to reduce the dimensionality of the parameters and then allows the user to modify the PCA components to generate new data points. This feature is in beta and subject to change.

Parameters:

sample_of_objects (list) – A list of objects that are either nn.Module or PipelineImageToCCtoLabels or PipelineImageToFilterToCCtoLabels or PipelineImageToFilterDelGraphtoIsPulsar or PipelineImageToDelGraphtoIsPulsar.
show_steps (bool, optional) – If True, it will print the steps of the tuning process. Defaults to True.
reset_components (bool, optional) – If True, it will reset the PCA components to the average scaled PCA factors. Defaults to True.
variance_to_capture (float|int, optional) – The variance to capture in PCA. Defaults to 0.95.
all_sliders (bool, optional) – If True, it will generate sliders for all PCA components. Defaults to False.

When a tuner instance is called with a folder_to_save argument, it will generate a mixed model and save the state_dict of the neural networks in the pipeline to the specified folder, and return the mixed model object.

generate_mixed_model_from_current_components(): Generates a mixed model obj from the current PCA components and returns it.

get_sample_pca_components(): Returns the sample PCA components used for tuning.

get_current_scaled_pca_factors(): Returns the current scaled PCA factors.

set_scaled_pca_factors(scaled_pca_factors: ndarray): Sets the scaled PCA factors to the given values.

generate_mixed_model_from_input_pca_factors(pca_factors: ndarray)

get_component_ranges()

src.pulsarsa.tuner_optimization module

class src.pulsarsa.tuner_optimization.PipelineTunerLossFunction(tuner: Tuner, image_dataset, label_dataset, sample_size, weights, save_model: bool = False, save_path: str = './syn_data/model/')

Bases: object

This class deals with calculating the loss function of the mixed model generated by mixing PCA components. In its current state it takes a Tuner object, image dataset, label dataset, sample size and weights for the loss function. For creating an instance of this class, you need to pass the Tuner object, image dataset, label dataset, sample size and weights.

Parameters:

tuner (Tuner) – The Tuner object that contains the PCA engine and other parameters.
image_dataset (np.ndarray) – numpy array of 2d images to be used for testing the mixed model.
label_dataset (np.ndarray) – numpy array of labels corresponding to the images, for example [‘Pulsar+NBRFI’, ‘Pulsar’].
sample_size (int) – Number of samples to randomly select from the dataset for calculating loss.
weights (list) – Weights to be applied to the loss function for each label component.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Note

The Tuner object should have been initialized with the PCA engine and the number of components to be used for mixing.
If the sample size is smaller than the number of images in the dataset, it will randomly select samples from the dataset which will lead to noise in the loss function. On the other hand, if the sample size is close to the number of images in the dataset, it will lead to a more stable loss function but in case of big datasets it will take a lot of time to calculate the loss function.

catalogue_data_set_containing_pulsar_nbrfi_bbrfi_none(): Reads the input NumPy label set and catalogues the indices of the entries containing Pulsar, NBRFI, BBRFI, or None.

create_normalized_ori_labels(idx: list)

create_normalized_pred_labels(predictor, idx: list)

src.pulsarsa.tuner_optimization.find_the_best_mixed_pipeline(f: PipelineTunerLossFunction, n_calls=200, n_initial_points=10, minimizer: str = 'gp_minimize', random_seed: int | None = None, niter=1, min_minima: bool = True, save_model: bool = False, save_path: str = './syn_data/model/')

This function finds the best mixed model by optimizing the PCA components using a loss function. It uses the skopt library to minimize the loss function by varying the PCA components. The function returns the best mixed model and the results of the optimization.

Parameters:

f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
n_calls (int, optional) – Number of calls to the loss function during optimization. Defaults to 200.
n_initial_points (int, optional) – Number of initial points to sample before optimization apart from the x0 points. Defaults to 10.
minimizer (str, optional) – Minimization method to use. Options are ‘gp_minimize’, ‘forest_minimize’. Defaults to ‘gp_minimize’.
random_seed (int, optional) – Random seed for reproducibility. Defaults to None.
niter (int, optional) – Number of iterations to run the optimization. Defaults to 1.
min_minima (bool, optional) – The type of minima to return. If True, returns the minimum minima found. If False, returns the minima closest to the mean of all minima found. Defaults to True.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Returns:

tuple – A tuple containing the best mixed model and a list of results from the optimization.
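
The effect of min_minima when niter is greater than 1 can be sketched as a selection over the minima found in each iteration. The helper name is hypothetical, and measuring "closest to the mean" in terms of loss value (rather than in PCA factor space) is an assumption.

```python
import numpy as np

def pick_minimum(minima_x, minima_loss, min_minima=True):
    losses = np.asarray(minima_loss, dtype=float)
    if min_minima:
        # Lowest loss across all optimization iterations
        idx = int(losses.argmin())
    else:
        # Minimum whose loss lies closest to the mean loss of all minima
        idx = int(np.abs(losses - losses.mean()).argmin())
    return minima_x[idx], float(losses[idx])

minima_x = [[0.1], [0.5], [0.9]]   # hypothetical PCA factors per iteration
minima_loss = [0.2, 0.5, 0.9]
lowest = pick_minimum(minima_x, minima_loss, min_minima=True)
typical = pick_minimum(minima_x, minima_loss, min_minima=False)
```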

src.pulsarsa.tuner_optimization.make_best_mixed_model_from_params(f: PipelineTunerLossFunction, best_pca_combi: list, save_model: bool = False, save_path: str = './syn_data/model/')

Creates the best mixed model from the given PCA components and saves it if required.

Parameters:

f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
best_pca_combi (list) – The best PCA components to be used for creating the mixed model.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Returns:

callable – The best mixed model created from the given PCA components.

src.pulsarsa.tuner_optimization.save_best_mixed_model(best_mixed_model, save_path: str = './syn_data/model/')

Saves the best mixed model’s dynamic parameters (specifically the neural nets) to a specified path.

Parameters:

best_mixed_model (callable) – The best mixed model to be saved.
save_path (str) – The directory where the model should be saved.

src.pulsarsa.tuner_optimization.plot_loss_in_2d_in_pairwise_parameter_combi(res, figsize_per_subplot=4, save_path=None)

Plot all pairwise 2D partial dependence plots of parameters in res on a grid. Adds a colorbar and axis labels for each subplot and shows the figure.

Parameters:

res – skopt result object (must have res.space.dimensions).
figsize_per_subplot – size per subplot in inches (default 4).
save_path – path to save the figure (default None, which means it won’t be saved).

Returns:

matplotlib.figure.Figure object

Module contents

class src.pulsarsa.Payload(freqs: list[float], bandwidths: list[float] | None = None)

Bases: object

This class is used to store radio packet data.

plot(type: str = 'img')

Plots the dataframe

Parameters:: type (str, optional) – Kind of plot to draw. Defaults to ‘img’.
Returns:: plt.axes – returns the axes of the plot

add_flux(radio_packet: list[list[float]])

Parameters:: radio_packet (list[list[float]]) – list of lists, where each list is a row of the dataframe. The first element of the list is the flux and the second element is the frequency.

add_description(description: dict)

Parameters:: description (dict) – dictionary with the keys ‘Pulsar’, ‘NBRFI’, ‘BBRFI’. The values of the dictionary are the number of the pulsar, NBRFI and BBRFI respectively.
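
A minimal sketch of checking a description dictionary for the three documented keys before it is attached to a payload. The helper `check_description` and the choice to raise `KeyError` are assumptions made for illustration; the source does not say how add_description handles a missing key.

```python
REQUIRED_KEYS = ("Pulsar", "NBRFI", "BBRFI")


def check_description(description: dict) -> dict:
    """Return the description restricted to the documented keys."""
    missing = [key for key in REQUIRED_KEYS if key not in description]
    if missing:
        # Raising here is an assumption; the real method may behave differently
        raise KeyError(f"description is missing keys: {missing}")
    return {key: description[key] for key in REQUIRED_KEYS}


# Counts of pulsars, narrow-band RFI and broad-band RFI in one payload
description = check_description({"Pulsar": 1, "NBRFI": 2, "BBRFI": 0})
```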

assign_bandwidths_to_freqchannels(bandwidths: list[float])

Assigns bandwidths to the frequency channels.

Parameters:: bandwidths (list[float]) – list of bandwidths.

Note

The bandwidths are assigned to the frequencies in the same order as the frequencies. The bandwidths list length must match the length of the frequencies.
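
The ordering and length rule in the note can be sketched as follows. The function `assign_bandwidths` is a hypothetical stand-in, not the package's method, and raising `ValueError` on a length mismatch is an assumption.

```python
def assign_bandwidths(freqs: list[float], bandwidths: list[float]) -> list[tuple[float, float]]:
    """Pair each frequency with the bandwidth at the same position."""
    if len(bandwidths) != len(freqs):
        # Assumed behaviour: reject lists of different length
        raise ValueError(
            f"expected {len(freqs)} bandwidths, got {len(bandwidths)}"
        )
    # zip preserves order, so bandwidths[i] belongs to freqs[i]
    return list(zip(freqs, bandwidths))


channels = assign_bandwidths([1400.0, 1410.0, 1420.0], [10.0, 10.0, 10.0])
```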

assign_rot_phases(rot_phases: list[float])

Assigns the rotation phases to the dataframe.

Parameters:: rot_phases (list[float]) – list of rotation phases.

write_payload_to_jsonfile(file_name: str)

This method writes the payload to a json file.

Parameters:: file_name (str) – path + name of the file to write to

classmethod read_payload_from_jsonfile(filename: str)

This method reads the payload from a json file.

Parameters:: filename (str) – path + name of the file to read from
Returns:: Payload – returns the payload object
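
The write and read pair is meant to round-trip a payload through a JSON file. A minimal sketch of that pattern follows, using a hypothetical `MiniPayload` dataclass; its field names and the on-disk layout are assumptions and may not match what Payload actually stores.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class MiniPayload:
    """Hypothetical stand-in for Payload; field names are assumptions."""
    freqs: list
    flux: list = field(default_factory=list)
    description: dict = field(default_factory=dict)

    def write_to_jsonfile(self, file_name: str) -> None:
        # Serialize all fields as one JSON object
        with open(file_name, "w", encoding="utf-8") as fh:
            json.dump(asdict(self), fh)

    @classmethod
    def read_from_jsonfile(cls, file_name: str) -> "MiniPayload":
        # Rebuild the object from the stored fields
        with open(file_name, encoding="utf-8") as fh:
            return cls(**json.load(fh))


original = MiniPayload(
    freqs=[1400.0, 1410.0],
    flux=[[0.1, 0.2]],
    description={"Pulsar": 1, "NBRFI": 0, "BBRFI": 0},
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "payload.json")
    original.write_to_jsonfile(path)
    restored = MiniPayload.read_from_jsonfile(path)
```

Implementing the reader as a classmethod, as the documented signature does, lets callers obtain an object without first constructing an empty one.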

class src.pulsarsa.PrepareFreqTimeImage(do_rot_phase_avg: bool = True, do_resize: bool = True, do_binarize: bool = False, resize_size: tuple = (256, 256), binarize_engine: ~src.pulsarsa.preprocessing.BinarizeToMask = <src.pulsarsa.preprocessing.BinarizeToMask object>)

Bases: object

Class implementing methods to load and preprocess radio payloads into freq-time images

preparation_protocol(payload: Payload)

Protocol to call methods to prepare freq-time graphs

Parameters:: payload (Payload) – Payload class object made during the simulation
Returns:: image (np.ndarray) – freq-time image

plot(payload_address: str)

plot the freq-time graph loaded from payload file

Parameters:: payload_address (str) – path to the payload file

average_payload_rotphase(payload: Payload)

Method to average a payload containing multiple rotations onto the 0-360 rotation-phase range

Parameters:: payload (Payload) – Payload class object made during the simulation
Returns:: image (np.ndarray) – freq-time image
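
One way to picture rotation-phase averaging is to fold every phase onto 0-360 degrees and average the samples that land in the same phase bin. The sketch below does this for a single frequency channel in pure Python; the function name, the binning scheme and the treatment of empty bins are assumptions, and the package's own method may differ.

```python
def fold_and_average(rot_phases, flux, n_bins=4):
    """Average flux samples that fall in the same 0-360 degree phase bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    bin_width = 360.0 / n_bins
    for phase, value in zip(rot_phases, flux):
        folded = phase % 360.0              # map any rotation onto 0-360
        idx = min(int(folded // bin_width), n_bins - 1)
        sums[idx] += value
        counts[idx] += 1
    # Bins with no samples are reported as 0.0
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Two rotations sampled at 0, 90, 180 and 270 degrees
phases = [0, 90, 180, 270, 360, 450, 540, 630]
flux = [1.0, 2.0, 3.0, 4.0, 3.0, 4.0, 5.0, 6.0]
profile = fold_and_average(phases, flux)   # [2.0, 3.0, 4.0, 5.0]
```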

class src.pulsarsa.ImageReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>, resize_size: tuple = (256, 256), do_average: bool = False, do_binarize: bool = False)

Bases: object

Class ImageReader acts as an engine to load/prepare images from payload files or numpy arrays

static read_from_payload(filename: str, resize_size: tuple = (128, 128), do_average: bool = False, do_binarize: bool = False)

Method to load freq-time image from payload files

Parameters:

filename (str) – full address to payload file
resize_size (tuple, optional) – output shape of the loaded image. Defaults to (128, 128).
do_average (bool, optional) – If True then the image is created by averaging phase values over many rotations. Defaults to False.
do_binarize (bool, optional) – If True then the image is binarized. Defaults to False.

Returns:

(np.ndarray) – loaded image

class src.pulsarsa.ImageDataSet(image_tag: str, image_directory: str, image_reader_engine: ~src.pulsarsa.pipeline_methods.ImageReader | ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = <src.pulsarsa.pipeline_methods.ImageReader object>)

Bases: object

This class is used to represent or memory map a set of images

plot(idx)

plots the image from the set represented by the idx

Parameters:: idx (int) – index of the image
Returns:: plt.axis – axis of the plot

class src.pulsarsa.LabelReader(file_type: type = <src.pulsarsa.information_packet_formats.Payload object>)

Bases: object

This class acts as an engine to read the labels from files of type payload

static read_from_payload(filename: str)

Method to read the label from payload file

Parameters:: filename (str) – full path to the payload file
Returns:: dict – dictionary containing details of the payload file

static read_from_str(label_memomory_map: str): Method to read the label from numpy file

static correct_key_names(description: dict)

class src.pulsarsa.LabelDataSet(image_tag: str, image_directory: str, label_reader_engine: ~src.pulsarsa.pipeline_methods.LabelReader = <src.pulsarsa.pipeline_methods.LabelReader object>)

Bases: object

This class is used to represent or memory map a set of labels of the images

plot(idx)

prints the label of idx image

Parameters:: idx (int) – index representing the image
Returns:: dict – description

class src.pulsarsa.ImageToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent Image Mask dataset

plot(index)

Plot image mask pair represented by index

Parameters:: index (int) – index of the pair

class src.pulsarsa.InMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent InMask Mask dataset

plot(index)

Plot InMask and Mask pair

Parameters:: index (int) – index of the pair to plot

class src.pulsarsa.ImageInMaskToMaskDataset(image_tag: str, mask_tag: str, image_directory: str, mask_directory: str, mask_maker_engine: ~src.pulsarsa.pipeline_methods.PipelineImageToMask, image_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': False, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, mask_engine: ~src.pulsarsa.preprocessing.PrepareFreqTimeImage = PrepareFreqTimeImage class object with attributes {'do_rot_phase_avg': True, 'do_resize': True, 'resize_size': (256, 256), 'do_binarize': True, 'binarize_engine': <src.pulsarsa.preprocessing.BinarizeToMask object>}, device: ~torch.device = device(type='cpu'))

Bases: Dataset

Class to represent Image+InMask Mask dataset

plot(index): Plot the 2-channel input (image + in_mask) and the target mask.

class src.pulsarsa.TrainImageToMaskNetworkModel(num_epochs: int, loss_criterion: Module = CustomLossUNet(), store_trained_model_at: str = './syn_data/model/trained_unet_test_v0.pt', model: Module | None = None, train_test_split: float = 0.8, learning_rate: float = 0.001, batch_size: int = 10)

Bases: object

Class involving methods to train Image (or InMask) to Mask Network

Train the network with validation, early stopping, and optional plotting.

Parameters:

image_mask_pairset (ImageToMaskDataset|InMaskToMaskDataset|ImageInMaskToMaskDataset) – Dataset containing image-mask pairs.
early_stopping_patience (int) – Epochs to wait after no val loss improvement.
plot (bool) – Whether to show a loss curve plot. Default is False.

Returns:

(list, list, list) – epochs, training losses, validation losses
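
The role of early_stopping_patience can be illustrated with a small loop over validation losses. This is a generic sketch of patience-based early stopping, not the trainer's actual code; the function name and return values are made up for the example.

```python
def run_with_early_stopping(val_losses, patience):
    """Return (epochs_run, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses):
        epochs_run = epoch + 1
        if loss < best_loss:
            # Improvement: remember it and reset the counter
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                        # stop before the remaining epochs
    return epochs_run, best_epoch


epochs_run, best_epoch = run_with_early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5], patience=3
)
# Stops after 6 epochs; the best validation loss was at epoch index 2
```

In this example the last loss (0.5) is never reached, which shows the trade-off: a small patience saves time but can stop before a late improvement.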

test_model(image: tensor, plot_pred: bool = False)

Method to test the network

Parameters:

image (torch.tensor) – image to convert to mask
plot_pred (bool, optional) – If True, plots the prediction from the NN. Defaults to False.

Returns:

(np.ndarray) – prediction by the network as predicted mask

class src.pulsarsa.TrainSignalToLabelModel(num_epochs: int, loss_criterion: Module = BCELoss(), store_trained_model_at: str = './syn_data/model/trained_OneDconvEncoder_test_v0.pt', model: Module | None = None)

Bases: object

Class involving methods to train nn to classify signal into labels

train_model(signal_label_pairset: SignalToLabelDataset)

Method to train the NN

Parameters:: signal_label_pairset (SignalToLabelDataset) – signal label pair dataset
Returns:: (list,list) – epoch number and the loss in each epoch

test_model_from_signal(signal: tensor, plot_pred: bool = False)

Method to test the nn to classify a signal

Parameters:

signal (torch.tensor) – signal to classify
plot_pred (bool, optional) – If True, plots the signal with the predicted probability of each category. Defaults to False.

Returns:

np.ndarray – Predicted labels with their probabilities

test_model(mask: ndarray, plot_pred: bool = False)

Method to test model from mask

Parameters:

mask (np.ndarray) – mask from which category probability is predicted after generating the signal
plot_pred (bool, optional) – If True plots the results. Defaults to False.

Returns:

np.ndarray – Predicted labels with their probabilities

class src.pulsarsa.UNet(in_channels=1, out_channels=1, init_features=32)

Bases: Module

This model is taken from the original UNet paper, as available on the PyTorch website. The input and output channels and the kernel size have been modified.

class src.pulsarsa.FilterCNN(in_channels=1, out_channels=1, init_features=32)

Bases: Module

This is a simple encoder-decoder architecture for filtering the segmented pulsar signals.

class src.pulsarsa.PipelineImageToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, min_cc_size_threshold: int = 10)

Bases: object

Class implementing methods in sequence to generate a segmented freq-time image, extract connected components (CCs), and assign the CCs to categories

get_nns_from_pipeline()

This method returns the neural networks used in the pipeline

Returns:: list – list of neural networks

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_labelled_skeleton_method(mask: ndarray)

Method to make labelled skeleton from mask

Parameters:: mask (np.ndarray) – segmented mask
Returns:: (np.ndarray) – labelled skeleton

labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)

Method to analyze each label of the labelled skeleton

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, returns detailed results. Defaults to False.

Returns:

list – results for each label

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
label_data_set (LabelDataSet) – Label dataset of the images
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, the images at the indices in ids_toshow are chosen from the dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.

Returns:

tuple – true_scores, true_negative_scores, false_scores, signal_present, f1_scores, precision, recall, accuracy
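
For reference, the last four entries of the returned tuple correspond to standard classification metrics. The sketch below computes them from confusion-matrix counts using the textbook definitions; how the pipeline itself counts true and false detections is not stated in the source, so the function and its inputs are illustrative only.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up counts for one label, e.g. 'Pulsar'
metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```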

class src.pulsarsa.PipelineImageToDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str)

Bases: object

Class implementing methods in sequence to generate a segmented freq-time image, then a delay graph, and then determine whether a pulsar is present

get_nns_from_pipeline()

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_signal_method(mask)

Method to convert mask to delaygraph and extract lags as signal

Parameters:: mask (np.ndarray) – mask
Returns:: np.ndarray – x_lags as signal

signal_to_label_method(signal: ndarray)

Method to determine whether a pulsar is present based on the signal

Parameters:: signal (np.ndarray) – x_lags as signal
Returns:: float – probability that a pulsar is present

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset

Returns:

float – efficiency measure

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, the images at the indices in ids_toshow are chosen from the dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

class src.pulsarsa.PipelineImageToFilterDelGraphtoIsPulsar(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, signal_to_label_network: Module, trained_signal_to_label_network: str, skip_filter: bool = False)

Bases: object

Class implementing methods in sequence to generate a segmented freq-time image, filter it, build a delay graph, and then determine whether a pulsar is present

skip_filtering(skip_filter: bool = False)

Method to skip filter step

Parameters:: skip_filter (bool, optional) – If True then skip filter step. Defaults to False.

get_nns_from_pipeline()

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

filter_mask_method(pred_binarized: ndarray)

Method to filter out wrong segments in the segmented mask

Parameters:: pred_binarized (np.ndarray) – segmented mask to filter
Returns:: np.ndarray – filtered segmented mask

mask_to_signal_method(mask)

Method to convert mask to delaygraph and extract lags as signal

Parameters:: mask (np.ndarray) – mask
Returns:: np.ndarray – x_lags as signal

signal_to_label_method(signal: ndarray)

Method to determine whether a pulsar is present based on the signal

Parameters:: signal (np.ndarray) – x_lags as signal
Returns:: float – probability that a pulsar is present

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset

Returns:

float – efficiency measure

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
mask_data_set (ImageDataSet) – Mask dataset of the image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, the images at the indices in ids_toshow are chosen from the dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_details: bool = False, plot_randomly: bool = True, batch_size: int = 5)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.

class src.pulsarsa.PipelineImageToFilterToCCtoLabels(image_to_mask_network: Module, trained_image_to_mask_network_path: str, mask_filter_network: Module, trained_mask_filter_network_path: str, min_cc_size_threshold: int = 10, skip_filter: bool = False, min_axis_ratio: float = 4.0, allow_image_as_input_to_filter: bool = False, remove_backgound: bool = True, one_pulse_mode: bool = True, return_snr: bool = False, box_func_window=5, snr_thresh=2, corr_thresh=5)

Bases: object

Class implementing methods in sequence to generate a segmented frequency-time image, filter it, perform connected components (CC) analysis, and categorize the components.

The processing pipeline consists of the following steps:

Image-to-mask conversion using a neural network.
Mask filtering using a neural network.
Mask-to-labeled skeleton conversion using connected components.
Labeled skeleton to category/label identification.

Parameters:

image_to_mask_network (nn.Module) – Neural network to convert image to mask.
trained_image_to_mask_network_path (str) – Path to trained weights of the image-to-mask network.
mask_filter_network (nn.Module) – Neural network to filter mask.
trained_mask_filter_network_path (str) – Path to trained weights of the mask filter network.
min_cc_size_threshold (int) – Minimum size of a connected component to be considered valid.
skip_filter (bool) – If True, skip the filtering step.
min_axis_ratio (float) – Minimum axis ratio of a connected component to be considered valid.
allow_image_as_input_to_filter (bool) – If True, allow the image as input to the filter network.
remove_background (bool) – If True, remove background from the image before processing.
one_pulse_mode (bool) – If True, use one-pulse mode. Clustering and regularization select the best cluster of CCs, which is then identified as a single category.
return_snr (bool) – If True, return SNR in the results.
box_func_window (int) – Window size for boxcar function to align pulses in each channel on the lines from each cluster.
snr_thresh (float) – Threshold for SNR to consider a cluster as a valid pulse.
corr_thresh (float) – Threshold for correlation to identify a valid pulse component in a channel for regularization.
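
The connected-components step and the role of min_cc_size_threshold can be sketched in pure Python as below. This is an illustration only: it uses 4-connectivity and a breadth-first search, both of which are assumptions, and it does not reproduce the skeletonization, axis-ratio check or network-based filtering of the real pipeline.

```python
from collections import deque


def label_components(mask, min_cc_size_threshold=2):
    """Label 4-connected components, dropping those below the size threshold."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            # Breadth-first search collects the pixels of one component
            queue, pixels = deque([(r, c)]), [(r, c)]
            labels[r][c] = -1                      # -1 marks "visited"
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = -1
                        queue.append((ny, nx))
                        pixels.append((ny, nx))
            keep = len(pixels) >= min_cc_size_threshold
            for y, x in pixels:
                labels[y][x] = next_label if keep else -1
            if keep:
                next_label += 1
    # Discarded pixels stayed at -1 so they were not revisited; clear them now
    return [[v if v > 0 else 0 for v in row] for row in labels]


mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
labelled = label_components(mask, min_cc_size_threshold=2)
# The two single-pixel components are removed; two labelled components remain
```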

__call__(image: np.ndarray, return_steps: bool = False): Runs the pipeline on the given image.

plot(image: ndarray, return_steps: bool = False)

Returns:: matplotlib.axes.Axes – Axes object with the plots

remove_background(image: ndarray) → ndarray

Method to remove a gradient-type background from the image using Gaussian filtering

Parameters:: image (np.ndarray) – input image
Returns:: np.ndarray – image with background removed

skip_filtering(skip_filter: bool = True)

Method to skip filtering step in the pipeline

Parameters:: skip_filter (bool, optional) – If True, then skip filtering step. Defaults to True.

get_nns_from_pipeline()

This method returns the neural networks used in the pipeline

Returns:: list – list of neural networks used in the pipeline

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

mask_to_labelled_skeleton_method(mask: ndarray)

Method to make labelled skeleton from mask

Parameters:: mask (np.ndarray) – segmented mask
Returns:: (np.ndarray) – labelled skeleton

filter_mask_method(pred_binarized: ndarray)

Method to filter out wrong segments in the segmented mask

Parameters:: pred_binarized (np.ndarray) – segmented mask to filter
Returns:: np.ndarray – filtered segmented mask

filter_imagemask_method(image_pred_binarized: ndarray) → ndarray

Filter out wrong segments in the segmented mask.

Parameters:: image_pred_binarized (np.ndarray) – input with 2 channels (image + in_mask), shape (H, W, 2)
Returns:: np.ndarray – filtered and binarized segmented mask, shape (H, W)

analyse_signal_noise_segement_ratio(binary_filtered_mask: ndarray, thresh: float = 0.3)

Returns:: bool – True if the signal-to-noise segment ratio is less than the threshold, False otherwise

labelled_skeleton_to_labels_method(labelled_skeleton: ndarray, return_detailed_results: bool = False)

Method to analyze each label of the labelled skeleton

Parameters:

labelled_skeleton (np.ndarray) – labelled skeleton
return_detailed_results (bool, optional) – If True, returns detailed results. Defaults to False.

Returns:

list – results for each label

labelled_skeleton_to_labels_for_one_pulse_method(labelled_skeleton: ndarray, real_image: ndarray, return_detailed_results: bool = False, return_snr: bool = False, snr_thresh=2, box_func_window=5, signal_length=20, top_n=5, corr_thresh=5)

Plot results of the pipeline with step outputs and comparison with pre-labelled dataset

Parameters:

image_data_set (ImageDataSet) – Image dataset
label_data_set (LabelDataSet) – Label dataset of the images
randomize (bool, optional) – If True, randomly chooses images from the dataset. Defaults to True.
ids_toshow (list, optional) – If randomize=False, the images at the indices in ids_toshow are chosen from the dataset. Defaults to [0, 1].
batch_size (int, optional) – If randomize=True, then chooses batch_size images from set. Defaults to 2.

test_on_real_data_from_npy_files(image_data_set: memmap, image_label_set: memmap | None = None, plot_randomly: bool = True, batch_size: int = 5, save_plot_path: str | None = None)

Method to test pipeline on .npy file dataset

Parameters:

image_data_set (np.memmap) – image dataset as numpy array
image_label_set (np.memmap | None, optional) – label dataset as numpy array. Defaults to None.
plot_details (bool, optional) – if True then plot the results. Defaults to False.
plot_randomly (bool, optional) – If True then randomly choose images from dataset. Defaults to True.
batch_size (int, optional) – number of images to test. minimum is 2. Defaults to 5.
save_plot_path (str, optional) – If provided saves the plot in that location. Defaults to None

measure_accuracy(image_data_set: memmap, label_data_set: memmap, plot_results: bool = False, return_specific_key: str | None = None)

validate_efficiency(image_data_set: ImageDataSet, label_data_set: LabelDataSet, plot_results: bool = True, return_specific_key: str | None = None)

Method to validate efficiency of the pipeline from image and label dataset

Parameters:

image_data_set (ImageDataSet) – image dataset
label_data_set (LabelDataSet) – label dataset
plot_results (bool, optional) – If True then plot the results. Defaults to True.
return_specific_key (str | None, optional) – If provided, returns only the results for this specific key. Defaults to None.

Returns:

tuple – true_scores, true_negative_scores, false_scores, signal_present, f1_scores, precision, recall, accuracy

class src.pulsarsa.PipelineImageToMask(image_to_mask_network: Module, trained_image_to_mask_network_path: str)

Bases: object

Class implementing methods in sequence to generate segmented freq-time Image from freq-time Image

image_to_mask_method(image: ndarray)

Method to convert image to mask

Parameters:: image (np.ndarray) – image
Returns:: image (np.ndarray) – mask

plot(image: ndarray)

plots the image from the set represented by the idx

Parameters:: image (np.ndarray) – image
Returns:: plt.axis – axis of the plot

class src.pulsarsa.Tuner(sample_of_objects: list[Module | PipelineImageToCCtoLabels | PipelineImageToFilterToCCtoLabels | PipelineImageToFilterDelGraphtoIsPulsar | PipelineImageToDelGraphtoIsPulsar], show_steps: bool = True, reset_components: bool = True, variance_to_capture: float | int = 0.95, all_sliders: bool = False)

Bases: object

Parameters:

sample_of_objects (list) – A list of objects that are either nn.Module or PipelineImageToCCtoLabels or PipelineImageToFilterToCCtoLabels or PipelineImageToFilterDelGraphtoIsPulsar or PipelineImageToDelGraphtoIsPulsar.
show_steps (bool, optional) – If True, it will print the steps of the tuning process. Defaults to True.
reset_components (bool, optional) – If True, it will reset the PCA components to the average scaled PCA factors. Defaults to True.
variance_to_capture (float|int, optional) – The variance to capture in PCA. Defaults to 0.95.
all_sliders (bool, optional) – If True, it will generate sliders for all PCA components. Defaults to False.
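
When variance_to_capture is a fraction such as 0.95, a common way to turn it into a component count is to keep the smallest number of leading PCA components whose cumulative explained-variance ratio reaches that fraction. The sketch below shows this rule with made-up ratios; whether Tuner applies exactly this rule, and how it treats an integer value, is an assumption not confirmed by the source.

```python
def n_components_for_variance(explained_variance_ratio, variance_to_capture=0.95):
    """Smallest number of leading components whose cumulative ratio meets the target."""
    cumulative = 0.0
    for n, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= variance_to_capture:
            return n
    # Target not reachable: keep every component
    return len(explained_variance_ratio)


# Hypothetical explained-variance ratios, sorted from largest to smallest
n_keep = n_components_for_variance([0.60, 0.25, 0.08, 0.04, 0.03])
```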

generate_mixed_model_from_current_components(): Generates a mixed model obj from the current PCA components and returns it.

get_sample_pca_components(): Returns the sample PCA components used for tuning.

get_current_scaled_pca_factors(): Returns the current scaled PCA factors.

set_scaled_pca_factors(scaled_pca_factors: ndarray): Sets the scaled PCA factors to the given values.

generate_mixed_model_from_input_pca_factors(pca_factors: ndarray)

get_component_ranges()

class src.pulsarsa.PipelineTunerLossFunction(tuner: Tuner, image_dataset, label_dataset, sample_size, weights, save_model: bool = False, save_path: str = './syn_data/model/')

Bases: object

Parameters:

tuner (Tuner) – The Tuner object that contains the PCA engine and other parameters.
image_dataset (np.ndarray) – numpy array of 2d images to be used for testing the mixed model.
label_dataset (np.ndarray) – numpy array of labels corresponding to the images, for example [‘Pulsar+NBRFI’, ‘Pulsar’].
sample_size (int) – Number of samples to randomly select from the dataset for calculating loss.
weights (list) – Weights to be applied to the loss function for each label component.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Note

The Tuner object should have been initialized with the PCA engine and the number of components to be used for mixing.
If the sample size is smaller than the number of images in the dataset, samples are selected at random, which introduces noise into the loss function. A sample size close to the number of images gives a more stable loss function, but for large datasets each loss evaluation takes a long time.
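
The effect of sample size on loss noise can be demonstrated with a toy experiment. The per-image losses below are random numbers invented for the example, and `mean_loss` and `spread` are hypothetical helpers, not part of the package.

```python
import random


def mean_loss(per_image_loss, sample_size, rng):
    """Average loss over a random subset of the dataset."""
    sample = rng.sample(per_image_loss, sample_size)
    return sum(sample) / sample_size


rng = random.Random(0)
per_image_loss = [rng.random() for _ in range(200)]   # made-up losses


def spread(sample_size, repeats=50):
    """Range of the loss over repeated evaluations with the same parameters."""
    values = [mean_loss(per_image_loss, sample_size, rng) for _ in range(repeats)]
    return max(values) - min(values)


small_spread = spread(10)     # each call sees a different 10-image subset
full_spread = spread(200)     # each call sees the whole dataset
```

With the full dataset the repeated evaluations agree up to floating-point rounding, while the 10-image subsets give visibly different values for identical parameters. An optimizer sees that difference as noise in the objective.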

catalogue_data_set_containing_pulsar_nbrfi_bbrfi_none(): This function reads the input label numpy set and catalogues the indices containing Pulsar, NBRFI, BBRFI, and None

create_normalized_ori_labels(idx: list)

create_normalized_pred_labels(predictor, idx: list)

src.pulsarsa.find_the_best_mixed_pipeline(f: PipelineTunerLossFunction, n_calls=200, n_initial_points=10, minimizer: str = 'gp_minimize', random_seed: int | None = None, niter=1, min_minima: bool = True, save_model: bool = False, save_path: str = './syn_data/model/')

Parameters:

f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
n_calls (int, optional) – Number of calls to the loss function during optimization. Defaults to 200.
n_initial_points (int, optional) – Number of initial points to sample before optimization apart from the x0 points. Defaults to 10.
minimizer (str, optional) – Minimization method to use. Options are ‘gp_minimize’, ‘forest_minimize’. Defaults to ‘gp_minimize’.
random_seed (int, optional) – Random seed for reproducibility. Defaults to None.
niter (int, optional) – Number of iterations to run the optimization. Defaults to 1.
min_minima (bool, optional) – Which minimum to return. If True, returns the lowest minimum found. If False, returns the minimum closest to the mean of all minima found. Defaults to True.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Returns:

tuple – A tuple containing the best mixed model and a list of results from the optimization.
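
The two min_minima modes can be sketched as a selection over the minima of several optimization runs. The helper below is hypothetical, and it assumes that "closest to the mean" is measured on the loss values rather than on the parameter vectors; the source does not specify which.

```python
def select_minimum(minima, min_minima=True):
    """Pick one (params, loss) pair from the minima of several runs."""
    if min_minima:
        # Lowest loss across all runs
        return min(minima, key=lambda m: m[1])
    # Otherwise take the run whose loss is closest to the mean loss
    mean_loss = sum(loss for _, loss in minima) / len(minima)
    return min(minima, key=lambda m: abs(m[1] - mean_loss))


# Made-up (PCA factors, loss) pairs from three runs
runs = [([0.1, 0.4], 0.30), ([0.2, 0.5], 0.10), ([0.0, 0.3], 0.22)]
lowest = select_minimum(runs, min_minima=True)
typical = select_minimum(runs, min_minima=False)
```

Picking the minimum nearest the mean trades a slightly higher loss for a result that is less likely to be an outlier produced by a noisy loss evaluation.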

src.pulsarsa.save_best_mixed_model(best_mixed_model, save_path: str = './syn_data/model/')

Saves the best mixed model’s dynamic parameters (specifically the neural nets) to a specified path.

Parameters:

best_mixed_model (callable) – The best mixed model to be saved.
save_path (str) – The directory where the model should be saved.

src.pulsarsa.plot_loss_in_2d_in_pairwise_parameter_combi(res, figsize_per_subplot=4, save_path=None)

Plot all pairwise 2D partial dependence plots of parameters in res on a grid. Adds a colorbar and axis labels for each subplot and shows the figure.

Parameters:

res – skopt result object (must have res.space.dimensions).
figsize_per_subplot – size per subplot in inches (default 4).
save_path – path to save the figure (default None, which means it won’t be saved).

Returns:

matplotlib.figure.Figure object

src.pulsarsa.make_best_mixed_model_from_params(f: PipelineTunerLossFunction, best_pca_combi: list, save_model: bool = False, save_path: str = './syn_data/model/')

Creates the best mixed model from the given PCA components and saves it if required.

Parameters:

f (PipelineTunerLossFunction) – An instance of the PipelineTunerLossFunction class that contains the loss function to be minimized.
best_pca_combi (list) – The best PCA components to be used for creating the mixed model.
save_model (bool, optional) – Whether to save the best mixed model or not. Defaults to False.
save_path (str, optional) – Path to save the best mixed model if save_model is True. Defaults to ‘./syn_data/model/’.

Returns:

callable – The best mixed model created from the given PCA components.