src.pulsarsa.tools.data_loaders package

Submodules

src.pulsarsa.tools.data_loaders.real_data_loaders module

class src.pulsarsa.tools.data_loaders.real_data_loaders.FilDataSource(data_path: str, metadata_path: str | None = None, position_column: str = 'position', time_window: int = 256, allow_randomness: bool = False)

Bases: object

Data source class that extracts dispersion graphs from a filterbank (.fil) file. When metadata is provided, pulse positions are read from the column named by position_column and accessed by index through the __getitem__ method.

get_metadata(key)

Returns the metadata row for the position index given by key.

Parameters:

key (int) – index representing the row of the position column in metadata

Returns:

pandas.core.series.Series – returns the metadata row as a pandas Series

load_data()

Reads the header of the filterbank data. This method is called during initialization.

gen_random_background_at_index(key: int, plot: bool = False)

Generates random background noise based on the mean and standard deviation of the dispersion graph at the specified index.

Parameters:
  • key (int) – index representing the row of the position column in metadata

  • plot (bool, optional) – whether to plot the generated background noise graph. Defaults to False.

Returns:

np.ndarray – generated random background noise graph
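
The idea of noise matched to a graph's statistics can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name random_background_like, the Gaussian distribution, and the seed argument are all assumptions, since the documentation only states that the noise is based on the mean and standard deviation.

```python
from typing import Optional

import numpy as np


def random_background_like(graph: np.ndarray, seed: Optional[int] = None) -> np.ndarray:
    """Draw Gaussian noise with the same shape, mean, and std as `graph`.

    Hypothetical helper: the Gaussian model is an assumption, not taken
    from the FilDataSource source code.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=graph.mean(), scale=graph.std(), size=graph.shape)


# Stand-in for a dispersion graph: 64 frequency channels x 256 time bins.
graph = np.random.default_rng(0).normal(loc=10.0, scale=2.0, size=(64, 256))
background = random_background_like(graph, seed=1)
```

The result has the same shape as the input, and its sample mean and standard deviation approximate those of the input graph.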

extract_a_dispersion_graph(position: int, time_window: int, plot: bool = False)

Extracts the dispersion graph at the given position in the filterbank file.

Parameters:
  • position (int) – time bin index in the filterbank file

  • time_window (int) – width of the time window to extract

  • plot (bool, optional) – whether to plot the extracted dispersion graph. Defaults to False.

Returns:

np.ndarray – extracted dispersion graph
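
A window extraction of this kind can be sketched on an in-memory array. The sketch below rests on assumptions: the data is laid out as (channels, time bins), the window starts at position rather than being centered on it, and out-of-range windows raise an error. The helper name extract_window is hypothetical, and the real method reads from the .fil file rather than from an array.

```python
import numpy as np


def extract_window(data: np.ndarray, position: int, time_window: int) -> np.ndarray:
    """Return the (channels, time_window) slice starting at time bin `position`.

    Hypothetical helper; the start-aligned window and the bounds check are
    assumptions, not documented behavior of extract_a_dispersion_graph.
    """
    n_bins = data.shape[1]
    if position < 0 or position + time_window > n_bins:
        raise IndexError(
            f"window [{position}, {position + time_window}) exceeds {n_bins} time bins"
        )
    return data[:, position:position + time_window]


# Stand-in for filterbank data: 4 channels x 1000 time bins.
data = np.arange(4 * 1000).reshape(4, 1000)
window = extract_window(data, position=300, time_window=256)
```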

get_num_time_bins()

Calculates the total number of time bins in the filterbank file.

Returns:

int – total number of time bins
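
For a SIGPROC-style filterbank file, the number of time bins is commonly derived from the size of the data payload, the number of channels, and the bits per sample. The sketch below shows that arithmetic. It is an assumption that get_num_time_bins works this way, and the function and parameter names are hypothetical.

```python
def num_time_bins(file_size: int, header_size: int, n_channels: int, n_bits: int) -> int:
    """Number of complete spectra (time bins) stored after the header.

    Hypothetical helper: each time bin holds n_channels samples of n_bits each.
    """
    payload_bits = (file_size - header_size) * 8
    bits_per_time_bin = n_channels * n_bits
    return payload_bits // bits_per_time_bin


# Example: 400-byte header, 512 channels of 8-bit samples, 1,000,000 spectra.
file_size = 400 + 512 * 1_000_000
n_bins = num_time_bins(file_size, header_size=400, n_channels=512, n_bits=8)
```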

get_signal_gaps_from_metadata()

Identifies positions in the metadata that are followed by a large gap before the next signal, judged relative to time_window.

Raises:

RuntimeError – If metadata is not provided

Returns:

np.ndarray – positions where large gaps are detected
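
Gap detection over a list of pulse positions can be sketched with numpy.diff. The criterion here is an assumption: a gap counts as large when it exceeds min_gap_windows times time_window, and the position before each large gap is returned. The helper name find_signal_gaps and the min_gap_windows parameter are hypothetical and do not appear in the package.

```python
import numpy as np


def find_signal_gaps(positions, time_window: int, min_gap_windows: int = 2) -> np.ndarray:
    """Return the positions that are followed by a gap wider than the threshold.

    Hypothetical helper; the threshold rule is an assumption.
    """
    if positions is None:
        # Mirrors the documented behavior of raising when metadata is missing.
        raise RuntimeError("metadata is required to locate signal gaps")
    positions = np.sort(np.asarray(positions))
    gaps = np.diff(positions)  # distance from each position to the next one
    return positions[:-1][gaps > min_gap_windows * time_window]


positions = [100, 300, 1500, 1700, 5000]
gap_starts = find_signal_gaps(positions, time_window=256)
```

With a 256-bin window and the default factor of 2, the threshold is 512 bins, so only the gaps after positions 300 and 1700 qualify.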

Module contents

class src.pulsarsa.tools.data_loaders.FilDataSource(data_path: str, metadata_path: str | None = None, position_column: str = 'position', time_window: int = 256, allow_randomness: bool = False)

Bases: object

Data source class that extracts dispersion graphs from a filterbank (.fil) file. When metadata is provided, pulse positions are read from the column named by position_column and accessed by index through the __getitem__ method.
