Checks for fitting signals

This file contains functions that check whether two signals fit. They can be used to check gluing or molecular fit regions.

fit_checks.check_correlation(first_signal, second_signal, threshold=None)

Returns the correlation coefficient between the two signals.

The signals can be either 1D arrays or 2D arrays containing the rolling slices of the input signals. In the 2D case, the function returns the sliding correlation between the original signals.

If a threshold is provided, returns True if the correlation is above the specified threshold.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
threshold: float or None: Threshold for the correlation coefficient.

Returns

correlation: float or boolean: If threshold is None, then the function returns the correlation coefficient. If a threshold is provided, the function returns True if the correlation value is above the threshold.
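As a rough illustration of the behaviour described above, the following sketch computes a Pearson correlation with NumPy and applies an optional threshold. The name `correlation_sketch` is hypothetical and this is not the module's implementation; treating each row of a 2D input as one window is an assumption based on the description.

```python
import numpy as np


def correlation_sketch(first_signal, second_signal, threshold=None):
    """Pearson correlation along the last axis (hypothetical helper)."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)

    # Remove the mean along the last axis so 1D and 2D inputs share one code path.
    dx = x - x.mean(axis=-1, keepdims=True)
    dy = y - y.mean(axis=-1, keepdims=True)

    correlation = (dx * dy).sum(axis=-1) / np.sqrt(
        (dx ** 2).sum(axis=-1) * (dy ** 2).sum(axis=-1)
    )

    if threshold is None:
        return correlation
    return correlation > threshold


x = np.linspace(1.0, 10.0, 50)
y = 2.0 * x                                    # proportional to x
r = correlation_sketch(x, y)                   # close to 1.0
ok = correlation_sketch(x, y, threshold=0.9)   # True
```

With 2D inputs the sketch returns one value per row, which is how a sliding correlation can be built from rolling slices.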

fit_checks.check_linear_fit_intercept_and_correlation(first_signal, second_signal)

Checks whether the intercept of a linear fit is near zero, and returns the correlation coefficient of the two signals.

Performs a linear fit to the data, assuming y = ax + b, with x the first_signal and y the second_signal. It will return the value np.abs(b / np.mean(y) * 100).

If the intercept is far from zero, it indicates that the two signals do not differ only by a multiplicative constant.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array

Returns

intercept_percent: float or boolean: The value of the intercept b, relative to the mean value of the second_signal.
correlation: float: Correlation coefficient between the two samples
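The quantity np.abs(b / np.mean(y) * 100) can be reproduced with a short sketch. It assumes an ordinary least-squares fit via `np.polyfit`; the helper name `intercept_and_correlation_sketch` is hypothetical and the module may fit the line differently.

```python
import numpy as np


def intercept_and_correlation_sketch(first_signal, second_signal):
    """Relative intercept (percent) and correlation (hypothetical helper)."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)

    # Degree-1 least-squares fit y = a * x + b; polyfit returns [a, b].
    a, b = np.polyfit(x, y, 1)

    intercept_percent = np.abs(b / np.mean(y) * 100)
    correlation = np.corrcoef(x, y)[0, 1]
    return intercept_percent, correlation


x = np.linspace(1.0, 10.0, 50)

# Signals that differ only by a constant factor: intercept close to 0 %.
pct_scaled, r_scaled = intercept_and_correlation_sketch(x, 3.0 * x)

# Adding an offset leaves the correlation at 1 but gives a clearly non-zero intercept.
pct_offset, r_offset = intercept_and_correlation_sketch(x, 3.0 * x + 5.0)
```

The second case shows why the intercept is checked in addition to the correlation: an offset between the signals does not reduce the correlation coefficient.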

fit_checks.check_min_max_ratio(first_signal, second_signal, threshold=None)

Returns the ratio between minimum and maximum values (i.e. min / max).

The operation is performed for both signals and the minimum of the two ratios is returned. The aim is to detect regions of large variation, e.g. edges of clouds. Similar values will also be returned when the signals are near 0, because the relative difference is then large. Consequently, this test should be used in parallel with other checks, e.g. on the signal-to-noise ratio.

If a threshold is provided, returns True if the ratio is above the specified threshold.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
threshold: float or None: Threshold for the min/max ratio.

Returns

minmax: float or boolean: If threshold is None, then the function returns the min/max ratio. If a threshold is provided, the function returns True if the min/max ratio is above the threshold.
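A minimal sketch of the min/max check, assuming the ratio is taken along the last axis of each signal and the smaller of the two ratios is kept. The name `min_max_ratio_sketch` is hypothetical.

```python
import numpy as np


def min_max_ratio_sketch(first_signal, second_signal, threshold=None):
    """Smaller of the two min / max ratios (hypothetical helper)."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)

    ratio_first = x.min(axis=-1) / x.max(axis=-1)
    ratio_second = y.min(axis=-1) / y.max(axis=-1)

    # Keep the worse (smaller) of the two ratios.
    minmax = np.minimum(ratio_first, ratio_second)

    if threshold is None:
        return minmax
    return minmax > threshold


smooth = np.array([8.0, 9.0, 10.0])   # min / max = 0.8
edge = np.array([1.0, 5.0, 10.0])     # min / max = 0.1
ratio = min_max_ratio_sketch(smooth, edge)                   # 0.1
passes = min_max_ratio_sketch(smooth, edge, threshold=0.5)   # False
```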

fit_checks.check_residuals_not_gaussian(first_signal, second_signal, threshold=None)

Check if the residuals of the linear fit are not from a normal distribution.

The function uses a Shapiro-Wilk test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the Shapiro-Wilk test on the residuals.

If a threshold is provided, returns True if the p-value is below the specified threshold, i.e. if the residuals are probably not Gaussian.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
threshold: float or None: Threshold for the Shapiro-Wilk p-value.

Returns

p_value: float or boolean: If threshold is None, then the function returns the p-value of the Shapiro-Wilk test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.
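The following sketch illustrates the idea with `scipy.stats.shapiro`. How the slope of y = ax is estimated is not stated in the text, so the through-origin least-squares formula below is an assumption, and `residuals_not_gaussian_sketch` is a hypothetical name.

```python
import numpy as np
from scipy import stats


def residuals_not_gaussian_sketch(first_signal, second_signal, threshold=None):
    """Shapiro-Wilk p-value of the residuals of y = a * x (hypothetical helper)."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)

    # Least-squares slope for a line through the origin (assumed fit method).
    a = np.sum(x * y) / np.sum(x * x)
    residuals = y - a * x

    _, p_value = stats.shapiro(residuals)

    if threshold is None:
        return p_value
    return p_value < threshold


x = np.linspace(1.0, 10.0, 200)

# A quadratic signal is not proportional to x, so the residuals follow a
# systematic curve and the test rejects normality.
y_quadratic = x ** 2
p_quadratic = residuals_not_gaussian_sketch(x, y_quadratic)
flagged = residuals_not_gaussian_sketch(x, y_quadratic, threshold=0.05)  # True

# A proportional signal with Gaussian noise, for comparison.
rng = np.random.default_rng(0)
y_noisy = 2.0 * x + rng.normal(0.0, 0.1, x.size)
p_noisy = residuals_not_gaussian_sketch(x, y_noisy)
```

Note that a True result means the fit is suspicious: the function flags residuals that are probably not Gaussian.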

fit_checks.check_residuals_not_gaussian_dagostino(first_signal, second_signal, threshold=None)

Check if the residuals of the linear fit are not from a normal distribution.

The function uses a D’Agostino-Pearson test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p-value of the D’Agostino-Pearson omnibus test on the residuals.

If a threshold is provided, returns True if the p-value is below the specified threshold, i.e. if the residuals are probably not Gaussian.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
threshold: float or None: Threshold for the D’Agostino-Pearson p-value.

Returns

p_value: float or boolean: If threshold is None, then the function returns the p-value of the D’Agostino-Pearson test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.
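In SciPy, the D’Agostino-Pearson omnibus test is available as `scipy.stats.normaltest`. The sketch below applies it to the residuals of a through-origin fit; the fit formula and the 0.05 threshold are illustrative assumptions, not values taken from the module.

```python
import numpy as np
from scipy import stats

x = np.linspace(1.0, 10.0, 200)
y = x ** 2  # deliberately not proportional to x

# Through-origin least-squares slope and residuals (assumed fit method).
a = np.sum(x * y) / np.sum(x * x)
residuals = y - a * x

# D'Agostino-Pearson omnibus test: combines skewness and kurtosis.
statistic, p_value = stats.normaltest(residuals)

threshold = 0.05                       # hypothetical significance level
not_gaussian = p_value < threshold     # True for these skewed residuals
```

Because the test is based on sample skewness and kurtosis, it needs a reasonable number of points; SciPy warns when the kurtosis part is computed from fewer than 20 samples.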

fit_checks.sliding_check_correlation(first_signal, second_signal, window_length=11, threshold=None)

Returns the sliding correlation coefficient between the two signals.

If a threshold is provided, returns True if the correlation is above the specified threshold.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
window_length: int: The length of the window. It should be an odd number.
threshold: float or None: Threshold for the correlation coefficient.

Returns

correlation: float or boolean: If threshold is None, then the function returns the correlation coefficient. If a threshold is provided, the function returns True if the correlation value is above the threshold.
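One way to picture the sliding version is to build rolling windows and correlate them row by row. The sketch below uses `numpy.lib.stride_tricks.sliding_window_view` and keeps only full windows; how the module treats the edges of the signal is not described here, so that choice is an assumption, as is the name `sliding_correlation_sketch`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_correlation_sketch(first_signal, second_signal, window_length=11):
    """Pearson correlation in every full window (hypothetical helper)."""
    # Each row of the 2D views is one window of the original signal.
    xw = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    yw = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)

    dx = xw - xw.mean(axis=-1, keepdims=True)
    dy = yw - yw.mean(axis=-1, keepdims=True)
    return (dx * dy).sum(axis=-1) / np.sqrt(
        (dx ** 2).sum(axis=-1) * (dy ** 2).sum(axis=-1)
    )


x = np.linspace(1.0, 10.0, 50)
y = 2.0 * x + 1.0
correlations = sliding_correlation_sketch(x, y, window_length=11)
above = correlations > 0.9   # boolean array, one entry per window
```

With 50 samples and a window of 11, the sketch yields 40 values, one for each full window.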

fit_checks.sliding_check_linear_fit_intercept_and_correlation(first_signal, second_signal, window_length=11)

Check if the intercept of a linear fit is near zero.

Performs a linear fit to the data, assuming y = ax + b, with x the first_signal and y the second_signal.

It will return the value np.abs(b / np.mean(y) * 100) and the correlation of the two signals.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
window_length: int: The length of the window. It should be an odd number.

Returns

intercepts: float or boolean: The value of the intercept b, relative to the mean value of the second_signal.
correlations: float: Correlation coefficient between the two samples
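A windowed variant of the intercept check can be sketched by repeating the linear fit in each window. The loop below is a plain illustration rather than the module's implementation; the name `sliding_intercept_sketch` and the decision to keep only full windows are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_intercept_sketch(first_signal, second_signal, window_length=11):
    """Relative intercept (percent) and correlation per full window."""
    xw = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    yw = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)

    intercepts = np.empty(len(xw))
    correlations = np.empty(len(xw))
    for i, (x, y) in enumerate(zip(xw, yw)):
        a, b = np.polyfit(x, y, 1)   # y = a * x + b within this window
        intercepts[i] = np.abs(b / np.mean(y) * 100)
        correlations[i] = np.corrcoef(x, y)[0, 1]
    return intercepts, correlations


x = np.linspace(1.0, 10.0, 30)
intercepts, correlations = sliding_intercept_sketch(x, 3.0 * x + 5.0, window_length=11)
```

In this example the fitted intercept is the same in every window, but the relative value decreases along the signal because the window mean of y grows.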

fit_checks.sliding_check_min_max_ratio(first_signal, second_signal, window_length=11, threshold=None)

Returns the sliding min/max ratio for both signals.

If a threshold is provided, returns True if the min/max ratio is above the specified threshold.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
window_length: int: The length of the window. It should be an odd number.
threshold: float or None: Threshold for the min/max ratio.

Returns

minmax: float or boolean: If threshold is None, then the function returns the min/max ratio. If a threshold is provided, the function returns True if the min/max ratio is above the threshold.
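The sliding min/max ratio can be illustrated on a signal with a sharp step, similar to a cloud edge. This sketch keeps only full windows and uses the hypothetical name `sliding_min_max_sketch`; both are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_min_max_sketch(first_signal, second_signal, window_length=11):
    """Smaller of the two min / max ratios in every full window."""
    xw = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    yw = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)

    ratio_first = xw.min(axis=-1) / xw.max(axis=-1)
    ratio_second = yw.min(axis=-1) / yw.max(axis=-1)
    return np.minimum(ratio_first, ratio_second)


flat = np.full(20, 10.0)
step = np.concatenate([np.full(10, 10.0), np.full(10, 1.0)])  # sharp edge in the middle
ratios = sliding_min_max_sketch(flat, step, window_length=5)
stable = ratios > 0.5   # False only for windows that straddle the edge
```

Only the four windows that contain samples from both sides of the step drop to a ratio of 0.1; all other windows stay at 1.0.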

fit_checks.sliding_check_residuals_not_gaussian(first_signal, second_signal, window_length, threshold=None)

Check if the residuals of the linear fit are not from a normal distribution.

The function uses a Shapiro-Wilk test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the Shapiro-Wilk test on the residuals.

If a threshold is provided, returns True if the p-value is below the specified threshold, i.e. if the residuals are probably not Gaussian.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
window_length: int: The length of the window. It should be an odd number.
threshold: float or None: Threshold for the Shapiro-Wilk p-value.

Returns

p_value: array: If threshold is None, then the function returns the p-value of the Shapiro-Wilk test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.
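For the sliding case, the result is one p-value per window. The sketch below loops over full windows and runs `scipy.stats.shapiro` on the residuals of a through-origin fit in each one; the fit formula, the edge handling and the name `sliding_shapiro_sketch` are all assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
from scipy import stats


def sliding_shapiro_sketch(first_signal, second_signal, window_length):
    """Shapiro-Wilk p-value of the fit residuals in every full window."""
    xw = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    yw = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)

    p_values = np.empty(len(xw))
    for i, (x, y) in enumerate(zip(xw, yw)):
        a = np.sum(x * y) / np.sum(x * x)   # slope of y = a * x (assumed fit method)
        _, p_values[i] = stats.shapiro(y - a * x)
    return p_values


rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 60)
y = 2.0 * x + rng.normal(0.0, 0.1, x.size)

p_values = sliding_shapiro_sketch(x, y, window_length=21)
not_gaussian = p_values < 0.05   # boolean array, one entry per window
```

Short windows give the test little statistical power, so the window length trades spatial resolution against the reliability of each p-value.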

fit_checks.sliding_check_residuals_not_gaussian_dagostino(first_signal, second_signal, window_length, threshold=None)

Check if the residuals of the linear fit are not from a normal distribution.

The function uses a D’Agostino-Pearson test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p-value of the D’Agostino-Pearson omnibus test on the residuals.

If a threshold is provided, returns True if the p-value is below the specified threshold, i.e. if the residuals are probably not Gaussian.

Parameters

first_signal: array: The first signal array
second_signal: array: The second signal array
window_length: int: The length of the window. It should be an odd number.
threshold: float or None: Threshold for the D’Agostino-Pearson p-value.

Returns

p_value: array: If threshold is None, then the function returns the p-value of the D’Agostino-Pearson test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.