Algorithms
- At the moment, the following algorithms are supported.
MOTMLE (Fits the MOT number from CMOS images)
Peak (Displays a peak in the SSD pulse data)
PeakFinder (Finds peaks in the SSD pulse data)
- class data_eng_utokyo.algorithms.MOTMLE(c, references: list, do_subtract_dead_pixels: bool = True, dead_pixel_percentile: float = 5.0)[source]
Bases: object
Applies Maximum Likelihood Estimation to extract the MOT number from an image.
- Parameters:
c – Lookup for the constants
references (list[str]) – List of files (images) which are to be used as reference for subtracting dead pixels.
do_subtract_dead_pixels (bool) – Whether to estimate and subtract the dead pixels before fitting and plotting.
dead_pixel_percentile (float) – Guess of the fraction of dead pixels in the image, given as a percentile.
Example
from data_eng_utokyo.algorithms import MOTMLE

mot_mle = MOTMLE(c=c_ccd, references=[], do_subtract_dead_pixels=False)
perform_analysis = mot_mle.perform_analysis
perform_analysis(
    source="path_to_image_file.xlsx",
    target="visualization.png",
    mode="mot number",
    min_signal=0,
    time="1st of January 2000 at 1 p.m.",
)
- references[source]
List of files (images) which are to be used as reference for subtracting dead pixels.
- Type:
list[str]
- do_subtract_dead_pixels[source]
Whether to estimate and subtract the dead pixels before fitting and plotting.
- Type:
bool
- perform_analysis(source: str, target: str, mode: str, min_signal: int = 0, time: str = 'unknown time')[source]
Executes the fitting for a single image.
Loads the image data, fits a 2D gaussian model on it, generates a plot of the original data and a fit, saves the plot, and returns the statistics of the fit.
source is the filepath of the original data and target is the filepath of the plot. The mode can be either ‘power’ or ‘mot number’. If the total sum of the image data is less than min_signal, the analysis is terminated early.
- Parameters:
source (str) – Filepath of the image file.
target (str) – Filepath of the plot we want to create.
mode (str) – Either ‘power’ or ‘mot number’, depending on what observable we want to fit.
min_signal (int) – Threshold. If the sum of the image is less than this value, the image is skipped.
time (str) – Time at which the image was taken, will be added to the plot.
- Returns:
Lookup of the results. Contains at least the keys “fit_successful”, “total_sum”, and “enough_pulses”, plus further keys when the fit is successful.
- Return type:
statistics (dict)
- _load(source: str) DataFrame[source]
Load the pandas dataframe and the right constants.
- Parameters:
source (str) – Filepath to image.
- Returns:
The image as pandas dataframe.
- Return type:
DataFrame
- _df_to_array(df: DataFrame) array[source]
Takes the image as df and returns it as np.array.
- Parameters:
df (pd.DataFrame) – The dataframe representing the image.
- Returns:
Image as np.array
- Return type:
array
- _precalculate_dead_pixels()[source]
Calculates a heuristic for finding the dead pixels.
Takes a list of reference images and, for each pixel, calculates the ratio standard deviation / max(average, 1) across the reference images. The dead pixels are the ones with the lowest std. Sets the member variables that are later used in the method _subtract_dead_pixels().
The assumption is that dead pixels have a high median of this value, so the dead pixels are the ones with the smallest ratio 1 / max(median, 1).
Stores the array that represents the mean value that the dead pixels have, with the healthy pixels set to zero.
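The heuristic above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the function name, the use of the mean rather than the median, and the choice to threshold the ratio at the dead_pixel_percentile are all assumptions, since the description leaves open exactly which quantity is thresholded.

```python
import numpy as np


def estimate_dead_pixels(reference_images, dead_pixel_percentile=5.0):
    """Hypothetical helper: flag pixels whose value barely varies across images."""
    stack = np.stack(reference_images).astype(float)  # (n_images, rows, cols)
    signal_mean = stack.mean(axis=0)
    signal_std = stack.std(axis=0)
    # Ratio from the description: std / max(average, 1), computed per pixel.
    ratio = signal_std / np.maximum(signal_mean, 1.0)
    # Assumption: the lowest `dead_pixel_percentile` percent of ratios are dead.
    threshold = np.percentile(ratio, dead_pixel_percentile)
    is_dead = ratio <= threshold
    # Mean value at the dead pixels, zero at the healthy pixels.
    dead_pixels = np.where(is_dead, signal_mean, 0.0)
    return signal_mean, ratio, signal_std, dead_pixels


# Synthetic reference images with one pixel stuck at the same high value.
rng = np.random.default_rng(0)
references = [rng.poisson(20.0, size=(8, 8)).astype(float) for _ in range(10)]
for image in references:
    image[2, 3] = 200.0
signal_mean, ratio, signal_std, dead_pixels = estimate_dead_pixels(references)
```

The four returned arrays mirror the arguments of _plot_dead_pixels(), which is why the sketch returns them together.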
- _plot_dead_pixels(signal_mean, ratio, signal_std, dead_pixels)[source]
Create heatmaps of the signal mean, ratio, std and estimated dead pixels.
- Parameters:
signal_mean (np.array) – Mean calculated signal as given by _precalculate_dead_pixels() method.
ratio (np.array) – See _precalculate_dead_pixels() method.
signal_std (np.array) – See _precalculate_dead_pixels() method.
dead_pixels (np.array) – See _precalculate_dead_pixels() method.
- _subtract_dead_pixels(data: dict)[source]
Subtracts the values of the dead pixels from the z-values of the data.
Values that become negative are replaced with 0. The data argument is modified in place, since dicts are mutable and passed by object reference.
- Parameters:
data (dict) – Lookup of the data with keys x, y, z and arrays are values.
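A minimal sketch of the subtraction and the in-place behaviour, assuming that the z array is flattened in the same row-major order as the dead-pixel array. The function name and the explicit dead_pixels argument are hypothetical; the documented method reads the stored member variable instead.

```python
import numpy as np


def subtract_dead_pixels(data, dead_pixels):
    """Subtract dead-pixel values from data["z"] in place, clipping at 0."""
    # Assumption: data["z"] has the same ordering as dead_pixels.ravel().
    data["z"] = np.clip(data["z"] - dead_pixels.ravel(), 0.0, None)


data = {
    "x": np.array([0, 1, 0, 1]),
    "y": np.array([0, 0, 1, 1]),
    "z": np.array([5.0, 3.0, 10.0, 1.0]),
}
dead_pixels = np.array([[0.0, 4.0], [0.0, 0.0]])
subtract_dead_pixels(data, dead_pixels)  # returns None; `data` itself changes
```

After the call, the second z value is 0 rather than -1, and the caller sees the change without any return value.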
- _preprocess(df: DataFrame, mode: str) dict[source]
Takes the image data as pandas dataframe and converts into numpy arrays. Converts the unit of the z-axis.
The conversion of the z axis is based on the setup constants and the mode.
- Parameters:
df (pd.DataFrame) – The dataframe representing the image.
mode (str) – Either ‘power’ or ‘mot number’, depending on what observable we want to fit.
- Returns:
Lookup representing the data with keys x, y, and z. Values are np.arrays.
- Return type:
dict
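As an illustration of this conversion, the sketch below turns a small DataFrame into the x, y, z lookup. It assumes that columns map to x, rows map to y, and that the unit conversion is a single multiplicative factor; the function name and signature are hypothetical.

```python
import numpy as np
import pandas as pd


def preprocess(df, scaling_factor):
    """Turn an image DataFrame into flat x, y, z arrays and rescale z."""
    rows, cols = df.shape
    # Assumption: the column position is the x coordinate, the row position is y.
    x, y = np.meshgrid(np.arange(cols), np.arange(rows))
    z = df.to_numpy(dtype=float) * scaling_factor
    return {"x": x.ravel(), "y": y.ravel(), "z": z.ravel()}


df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
data = preprocess(df, scaling_factor=2.0)
```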
- _get_scaling_factor(mode: str) float[source]
Loads a physical scaling parameter depending on the mode.
The unit of the z axis can be converted using a scaling factor. This function returns the scaling factor, which can be determined from the mode, which is either ‘power’ or ‘mot number’.
- Returns:
The scaling factor as float.
- Parameters:
mode (str)
- Return type:
float
- _fitting(model: callable, data: dict, mode: str)[source]
Fits the model to the data.
- Parameters:
model (callable) – Model to be fitted.
data (dict) – Lookup of the data.
mode (str) – Either ‘power’ or ‘mot number’. Changes the initial guess of the fitting procedure.
- Returns:
Lookup with the statistics of the fit.
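A fit of this kind can be sketched with scipy.optimize.curve_fit. Note the assumptions: the model form (axis-aligned 2D Gaussian plus constant offset), the parameter order, and the use of least squares are illustrative choices, not taken from the package. For independent Gaussian noise of equal variance, a least-squares fit coincides with the maximum likelihood estimate.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian_2d(xy, amplitude, x0, y0, sigma_x, sigma_y, offset):
    """Axis-aligned 2D Gaussian on a constant offset (assumed model form)."""
    x, y = xy
    exponent = (x - x0) ** 2 / (2 * sigma_x**2) + (y - y0) ** 2 / (2 * sigma_y**2)
    return amplitude * np.exp(-exponent) + offset


def fit(model, data, p0):
    """Least-squares fit of `model` to the x, y, z lookup."""
    xy = (data["x"], data["y"])
    popt, pcov = curve_fit(model, xy, data["z"], p0=p0)
    perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties from the covariance
    residuals = data["z"] - model(xy, *popt)
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((data["z"] - data["z"].mean()) ** 2)
    return {"popt": popt, "perr": perr, "r_squared": 1.0 - ss_res / ss_tot}


# Synthetic 30 x 30 image: a known Gaussian plus a little noise.
x, y = np.meshgrid(np.arange(30), np.arange(30))
rng = np.random.default_rng(1)
true_params = (50.0, 14.0, 16.0, 3.0, 4.0, 2.0)
z = gaussian_2d((x.ravel(), y.ravel()), *true_params)
z = z + rng.normal(0.0, 0.5, x.size)
data = {"x": x.ravel(), "y": y.ravel(), "z": z}
result = fit(gaussian_2d, data, p0=(40.0, 15.0, 15.0, 2.0, 2.0, 0.0))
```

The initial guess p0 is hand-picked here; in the documented class it comes from _get_initial_guess().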
- _extract_statistics(r_squared, chi2, popt, pcov, perr, signal_sum)[source]
Convert the fit results to a convenient lookup.
- Parameters:
r_squared
chi2
popt (np.array) – Optimal parameters.
pcov (np.array) – Covariance matrix of the optimal fit parameters. Used to get the uncertainty of the fit.
perr
signal_sum – Sum of the array to be fitted.
- _get_initial_guess(data: dict, mode: str)[source]
Proposes the initial guesses for the fitting parameters with heuristics.
- Parameters:
data (dict) – Data as lookup table.
mode (str) – Either ‘power’ or ‘mot number’. Changes the initial guess of the fitting procedure.
- _generate_fit_data(model: callable, data: dict, statistics: dict, df: DataFrame)[source]
Takes the x, y values of the data and the fit parameter, and returns fitted z values.
Does this on a x, y grid in the same format as the data.
- Parameters:
model (callable) – Model to be fitted.
data (dict) – Lookup of the data.
statistics (dict) – Lookup of the statistics of the fit result.
df (DataFrame)
- Returns:
Fit data as lookup in the same format as the original data. Has three keys x, y, z, corresponding to the coordinate x, y, and the fitted z value, respectively.
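Evaluating the fitted model on the data grid might look like the sketch below. It assumes that the model takes an (x, y) tuple followed by the fit parameters, and it passes the optimal parameters directly rather than reading them from the statistics lookup. The plane model is only a stand-in for illustration.

```python
import numpy as np


def generate_fit_data(model, data, popt):
    """Evaluate the fitted model at the x, y positions of the data."""
    z_fit = model((data["x"], data["y"]), *popt)
    return {"x": data["x"], "y": data["y"], "z": z_fit}


def plane(xy, a, b, c):
    # Stand-in model for illustration; the documented fit uses a 2D Gaussian.
    x, y = xy
    return a * x + b * y + c


data = {
    "x": np.array([0.0, 1.0, 0.0, 1.0]),
    "y": np.array([0.0, 0.0, 1.0, 1.0]),
    "z": np.zeros(4),
}
fit_data = generate_fit_data(plane, data, popt=(1.0, 2.0, 3.0))
```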
- _plot_fit_result(data: dict, fit_data: dict, target: str, mode: str, time: str)[source]
Plots the 3D data and the fit. Saves the image to the target file.
- Parameters:
data (dict) – Original data.
fit_data (dict) – Fitted data.
target (str) – Filename of the plot which is to be created and saved.
mode (str) – Either ‘power’ or ‘mot number’. Changes the initial guess of the fitting procedure.
time (str) – Time when the image was taken. Is added to the title of the plot.
- _plot_heatmap(data: dict, fit_data, target: str, mode: str, time: str, df: DataFrame)[source]
Plots the 3D data and the fit as a heatmap. Saves the image to the target file.
- Parameters:
data (dict) – Original data.
fit_data (dict) – Fitted data.
target (str) – Filename of the plot which is to be created and saved.
mode (str) – Either ‘power’ or ‘mot number’. Changes the initial guess of the fitting procedure.
time (str) – Time when the image was taken. Is added to the title of the plot.
df (DataFrame)
- class data_eng_utokyo.algorithms.Peak(timestamp: int, events: list, background: float)[source]
Bases: object
Represents a peak of the SSD data and can perform analysis on itself.
- Parameters:
timestamp (int) – Time of the peak.
events (list) – List of pulses making up the peak.
background (float) – Pulse rate [1/s] representing the background.
- pulses_background[source]
Expected number of background pulses in the time interval provided by the events.
- Type:
int
- pulses_peak[source]
Excess of pulses, i.e. how many more pulses we see than expected from the background.
- Type:
int
- estimate()[source]
Calculates the half-life with MLE.
The result can be derived with MLE by adapting the reasoning from here: https://math.stackexchange.com/questions/101481/calculating-maximum-likelihood-estimation-of-the-exponential-distribution-and-pr
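For a pure exponential decay, the MLE of the mean lifetime is the sample mean of the delays, and the half-life follows by multiplying with ln 2. The sketch below shows only this textbook case. It ignores the background, so it is not the estimator used by the class, and the function name and arguments are hypothetical.

```python
import math


def half_life_mle(decay_times, start_time):
    """MLE of the half-life for exponentially distributed delays after start_time."""
    delays = [t - start_time for t in decay_times]
    # For an exponential distribution, the MLE of the mean lifetime
    # is the sample mean of the observed delays.
    mean_lifetime = sum(delays) / len(delays)
    return mean_lifetime * math.log(2)


half_life = half_life_mle([1.0, 2.0, 3.0, 6.0], start_time=0.0)
```

With a mean delay of 3.0, the estimated half-life is 3 ln 2.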
- plot(url: str)[source]
Visualizes the data and fit and saves the image to the url.
- Parameters:
url (str) – Name of the file to which the plot should be saved.
- class data_eng_utokyo.algorithms.PeakFinder(recorder, plot_filename: str = '')[source]
Bases: object
Class for finding peaks in the SSD data, which come from the release of many atoms by heating the Yttrium.
- Parameters:
plot_filename (str)
- get_new_peaks(df) list[source]
Loads the new data, estimates the background, finds the peaks and returns them.
- Return type:
list
- _get_new_data(df) DataFrame[source]
Returns the part of the data which has not been processed yet.
- Return type:
DataFrame
- _calculate_new_background(new_df: DataFrame)[source]
Update the estimation of the background based on the new data.
- Parameters:
new_df (DataFrame)
- _find_peaks(new_df: DataFrame) list[source]
Uses a sliding window approach to find times (peaks) in which many signals were recorded. Returns the peaks as a list of timestamps.
- Parameters:
new_df (DataFrame)
- Return type:
list
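One way to realise a sliding window over event timestamps is sketched below. The window length, the step size, and the function name are assumptions; the documentation does not state the values or the exact scheme used.

```python
import numpy as np


def sliding_window_counts(timestamps, window, step):
    """Count events in windows [start, start + window) that advance by `step`."""
    timestamps = np.sort(np.asarray(timestamps, dtype=float))
    starts = np.arange(timestamps[0], timestamps[-1] + step, step)
    # On sorted timestamps, the difference of two searchsorted positions
    # is the number of events inside each window.
    left = np.searchsorted(timestamps, starts, side="left")
    right = np.searchsorted(timestamps, starts + window, side="left")
    return starts, right - left


timestamps = [0.0, 1.0, 5.0, 5.1, 5.2, 5.3, 9.0]
starts, counts = sliding_window_counts(timestamps, window=1.0, step=1.0)
busiest_start = starts[np.argmax(counts)]
```

Windows with unusually high counts are the candidate peaks; here the window starting at 5.0 holds four of the seven events.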
- _find_peaks_in_1d_array(array: list, timestamps: list)[source]
Takes the array and finds local maxima which have at least a certain time difference.
- Parameters:
array (list)
timestamps (list)
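A possible reading of this step is a greedy selection: take the local maxima, visit them from highest to lowest, and keep a maximum only if it is far enough in time from every maximum already kept. The sketch below follows that reading; the greedy rule, the min_separation parameter, and the function name are assumptions.

```python
def find_separated_maxima(values, timestamps, min_separation):
    """Return timestamps of local maxima that are at least min_separation apart."""
    # Interior local maxima; on a plateau only the left edge counts.
    candidates = [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]
    accepted = []
    # Greedy: visit the highest maxima first and drop any maximum
    # that lies too close in time to one that was already accepted.
    for i in sorted(candidates, key=lambda i: values[i], reverse=True):
        if all(
            abs(timestamps[i] - timestamps[j]) >= min_separation
            for j in accepted
        ):
            accepted.append(i)
    return sorted(timestamps[i] for i in accepted)


values = [0, 3, 1, 4, 0, 0, 5, 0]
timestamps = [0, 1, 2, 3, 4, 5, 6, 7]
peaks = find_separated_maxima(values, timestamps, min_separation=3)
```

In this example the maximum at timestamp 1 is dropped because it lies only 2 time units from the higher maximum at timestamp 3.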
- _find_maximum(df) float[source]
Generates a histogram of the timestamps with 50 bins. Returns the timestamp of the left side of the bin with the highest count.
- Return type:
float
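The described behaviour maps directly onto numpy.histogram, as in the following sketch. It assumes that the timestamps are available as a plain array; the documented method takes a DataFrame, and the column it reads is not stated here.

```python
import numpy as np


def find_maximum(timestamps, bins=50):
    """Left edge of the histogram bin that contains the most timestamps."""
    counts, edges = np.histogram(timestamps, bins=bins)
    # edges has bins + 1 entries; edges[i] is the left side of bin i.
    return float(edges[np.argmax(counts)])


# Evenly spread timestamps plus a burst of 20 events at t = 42.5.
timestamps = np.concatenate([np.linspace(0.0, 100.0, 101), np.full(20, 42.5)])
peak_time = find_maximum(timestamps)
```

Because the left bin edge is returned, the result can lie up to one bin width before the actual burst (42.0 versus 42.5 in this example).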
- _generate_peaks(df: DataFrame, peak_timestamps: list) list[source]
Takes the new data and a list of the timestamps of the new peaks, and builds a list of the Peak instances.
- Parameters:
df (DataFrame)
peak_timestamps (list)
- Return type:
list