src.pulsarsa.tools.data_loaders package
Submodules
src.pulsarsa.tools.data_loaders.real_data_loaders module
- class src.pulsarsa.tools.data_loaders.real_data_loaders.FilDataSource(data_path: str, metadata_path: str | None = None, position_column: str = 'position', time_window: int = 256, allow_randomness: bool = False)
Bases: object
Data source class to extract dispersion graphs from a .fil file based on provided metadata. If metadata is provided, it extracts pulse positions from the specified column using the __getitem__ method.
- get_metadata(key)
Extract metadata information for the position index represented by the key
- Parameters:
key (int) – index representing the row of the position column in metadata
- Returns:
pandas.core.series.Series – returns the metadata row as a pandas Series
- load_data()
Reads the header of the filterbank (.fil) data. This method is called during initialization.
- gen_random_background_at_index(key: int, plot: bool = False)
Generates random background noise based on mean and std deviation of the dispersion graph at the specified index
- Parameters:
key (int) – index representing the row of the position column in metadata
plot (bool, optional) – whether to plot the generated background noise graph. Defaults to False.
- Returns:
np.ndarray – generated random background noise graph
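As a rough illustration of the idea described above, the sketch below draws noise with the same shape, mean, and standard deviation as a given array. It is not the package's implementation: the helper name `random_background_like`, the use of a Gaussian distribution, and the use of global (rather than per-channel) statistics are all assumptions made for the example.

```python
from typing import Optional

import numpy as np


def random_background_like(graph: np.ndarray, seed: Optional[int] = None) -> np.ndarray:
    """Draw Gaussian noise matching the shape, mean and std of `graph`.

    Hypothetical helper: the real method may use a different distribution
    or compute statistics per frequency channel.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=graph.mean(), scale=graph.std(), size=graph.shape)


# Stand-in for a dispersion graph: 64 frequency channels x 256 time bins.
graph = np.random.default_rng(0).normal(10.0, 2.0, size=(64, 256))
background = random_background_like(graph, seed=1)
```

Passing a fixed `seed` makes the generated background reproducible, which can help when comparing runs.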
- extract_a_dispersion_graph(position: int, time_window: int, plot: bool = False)
Extracts the dispersion graph at position in the filterbank file
- Parameters:
position (int) – position representing the timebin of the filterbank file
time_window (int) – width of the time window to extract
plot (bool, optional) – whether to plot the extracted dispersion graph. Defaults to False.
- Returns:
np.ndarray – extracted dispersion graph
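The extraction can be pictured as slicing a two-dimensional array along the time axis. The sketch below assumes the data is laid out as (channels, time bins) and that the window starts at `position`; the documentation does not say whether the real method starts at, or centres on, `position`, so both choices and the helper name `extract_window` are assumptions.

```python
import numpy as np


def extract_window(data: np.ndarray, position: int, time_window: int) -> np.ndarray:
    """Return `time_window` time bins of `data`, starting at `position`.

    Hypothetical helper; `data` is assumed to have shape
    (n_channels, n_time_bins).
    """
    if position < 0 or position + time_window > data.shape[1]:
        raise ValueError("requested window falls outside the data")
    return data[:, position:position + time_window]


# Toy data: 4 channels x 1000 time bins.
data = np.arange(4 * 1000).reshape(4, 1000)
graph = extract_window(data, position=300, time_window=256)
```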
- get_num_time_bins()
Calculates the total number of time bins in the filterbank file
- Returns:
int – total number of time bins
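For a filterbank file that stores a header followed by contiguous samples, the number of time bins is commonly derived from the file size. The sketch below shows that arithmetic; it is an assumption about how such a count can be obtained, not a description of this method's internals, and `num_time_bins` and its parameters are hypothetical names.

```python
def num_time_bins(file_size: int, header_size: int, n_channels: int, n_bits: int) -> int:
    """Number of complete time samples in the data section of a file.

    Assumes every time sample stores `n_channels` values of `n_bits` bits
    each, packed back to back after a header of `header_size` bytes.
    """
    data_bits = (file_size - header_size) * 8
    return data_bits // (n_channels * n_bits)


# Example: 400-byte header, 1024 channels, 8-bit samples, 10000 time bins.
header_size = 400
file_size = header_size + 1024 * 10000
total_bins = num_time_bins(file_size, header_size, n_channels=1024, n_bits=8)
```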
- get_signal_gaps_from_metadata()
Identifies positions in the metadata where there are large gaps between signals based on the time_window
- Raises:
RuntimeError – If metadata is not provided
- Returns:
np.ndarray – positions where large gaps are detected
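One way to detect such gaps is to sort the pulse positions and compare consecutive differences against the window width. The sketch below assumes a gap is "large" when it exceeds `time_window` and returns the position that precedes each gap; the actual threshold and return convention are not documented here, and `find_large_gaps` is a hypothetical helper.

```python
import numpy as np


def find_large_gaps(positions, time_window: int) -> np.ndarray:
    """Return positions followed by a gap larger than `time_window`.

    Hypothetical helper: the threshold (gap > time_window) is an assumption.
    """
    positions = np.sort(np.asarray(positions))
    gaps = np.diff(positions)  # distance to the next pulse position
    return positions[:-1][gaps > time_window]


positions = [100, 150, 900, 1000, 5000]
gap_starts = find_large_gaps(positions, time_window=256)
```

With these inputs the consecutive differences are 50, 750, 100, and 4000, so the positions 150 and 1000 are reported.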
Module contents
- class src.pulsarsa.tools.data_loaders.FilDataSource(data_path: str, metadata_path: str | None = None, position_column: str = 'position', time_window: int = 256, allow_randomness: bool = False)
Bases: object
Data source class to extract dispersion graphs from a .fil file based on provided metadata. If metadata is provided, it extracts pulse positions from the specified column using the __getitem__ method.
- get_metadata(key)
Extract metadata information for the position index represented by the key
- Parameters:
key (int) – index representing the row of the position column in metadata
- Returns:
pandas.core.series.Series – returns the metadata row as a pandas Series
- load_data()
Reads the header of the filterbank (.fil) data. This method is called during initialization.
- gen_random_background_at_index(key: int, plot: bool = False)
Generates random background noise based on mean and std deviation of the dispersion graph at the specified index
- Parameters:
key (int) – index representing the row of the position column in metadata
plot (bool, optional) – whether to plot the generated background noise graph. Defaults to False.
- Returns:
np.ndarray – generated random background noise graph
- extract_a_dispersion_graph(position: int, time_window: int, plot: bool = False)
Extracts the dispersion graph at position in the filterbank file
- Parameters:
position (int) – position representing the timebin of the filterbank file
time_window (int) – width of the time window to extract
plot (bool, optional) – whether to plot the extracted dispersion graph. Defaults to False.
- Returns:
np.ndarray – extracted dispersion graph
- get_num_time_bins()
Calculates the total number of time bins in the filterbank file
- Returns:
int – total number of time bins
- get_signal_gaps_from_metadata()
Identifies positions in the metadata where there are large gaps between signals based on the time_window
- Raises:
RuntimeError – If metadata is not provided
- Returns:
np.ndarray – positions where large gaps are detected