Checks for fitting signals
This module contains functions that check whether two signals fit each other. They can be used to check gluing regions or molecular fit regions.
- fit_checks.check_correlation(first_signal, second_signal, threshold=None)
Returns the correlation coefficient between the two signals.
The signals can be either 1D arrays or 2D arrays containing the rolling slices of the input signals. In the 2D case, the function returns the sliding correlation between the original signals.
If a threshold is provided, returns True if the correlation is above the specified threshold.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- threshold: float or None
Threshold for the correlation coefficient.
- Returns
- correlation: float or boolean
If threshold is None, the function returns the correlation coefficient. If a threshold is provided, the function returns True if the correlation value is above the threshold.
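The 1D behaviour described above can be illustrated with a minimal sketch. This is not the fit_checks implementation: the name correlation_sketch is hypothetical, and the use of np.corrcoef (Pearson correlation) is an assumption about how the coefficient is computed.

```python
import numpy as np


def correlation_sketch(first_signal, second_signal, threshold=None):
    """Hypothetical stand-in: Pearson correlation of two 1D signals."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)
    correlation = np.corrcoef(x, y)[0, 1]
    if threshold is None:
        return correlation
    # With a threshold, report only whether the correlation exceeds it.
    return bool(correlation > threshold)


x = np.linspace(1.0, 10.0, 50)
y = 2.5 * x  # the second signal is an exact multiple of the first
print(correlation_sketch(x, y))
print(correlation_sketch(x, y, threshold=0.9))
```

Passing a threshold turns the numeric result into a pass/fail flag, which is convenient when several checks are combined.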
- fit_checks.check_linear_fit_intercept_and_correlation(first_signal, second_signal)
Checks whether the intercept of a linear fit is near zero, and computes the correlation coefficient of the two signals.
Performs a linear fit to the data, assuming y = ax + b, with x the first_signal and y the second_signal. It returns the value np.abs(b / np.mean(y) * 100), i.e. the intercept as a percentage of the mean of the second signal, together with the correlation coefficient.
If the intercept is far from zero, the two signals do not differ only by a multiplicative constant.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- Returns
- intercept_percent: float or boolean
The value of the intercept b, relative to the mean value of the second_signal.
- correlation: float
Correlation coefficient between the two samples
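The intercept check can be sketched as follows. This is an illustration under stated assumptions, not the library code: the function name is hypothetical, and np.polyfit and np.corrcoef are assumed choices for the fit and the correlation. Only the expression np.abs(b / np.mean(y) * 100) is taken from the description above.

```python
import numpy as np


def intercept_and_correlation_sketch(first_signal, second_signal):
    """Hypothetical stand-in: relative intercept (percent) and correlation."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)
    a, b = np.polyfit(x, y, 1)  # least-squares fit of y = a x + b
    intercept_percent = np.abs(b / np.mean(y) * 100)
    correlation = np.corrcoef(x, y)[0, 1]
    return intercept_percent, correlation


x = np.linspace(1.0, 10.0, 50)
proportional = 3.0 * x       # differs from x only by a constant factor
shifted = 3.0 * x + 5.0      # has an additional offset

print(intercept_and_correlation_sketch(x, proportional))
print(intercept_and_correlation_sketch(x, shifted))
```

Both pairs are perfectly correlated, so only the relative intercept separates the proportional pair from the shifted one.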
- fit_checks.check_min_max_ratio(first_signal, second_signal, threshold=None)
Returns the ratio between the minimum and maximum values (i.e. min / max).
The ratio is computed for each signal and the smaller of the two is returned. The aim is to detect regions of large variation, e.g. the edges of clouds. Signals near 0 produce similar results, because the relative difference is then large. Consequently, this test should be used alongside other checks, e.g. on the signal-to-noise ratio.
If a threshold is provided, returns True if the ratio is above the specified threshold.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- threshold: float or None
Threshold for the min/max ratio.
- Returns
- minmax: float or boolean
If threshold is None, the function returns the min/max ratio. If a threshold is provided, the function returns True if the min/max ratio is above the threshold.
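A minimal sketch of the min/max check, assuming that "the minimum is returned" means the smaller of the two per-signal ratios. The function name is hypothetical and the code is not taken from fit_checks.

```python
import numpy as np


def min_max_ratio_sketch(first_signal, second_signal, threshold=None):
    """Hypothetical stand-in: smaller of the two min/max ratios."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)
    ratio = float(min(x.min() / x.max(), y.min() / y.max()))
    if threshold is None:
        return ratio
    return bool(ratio > threshold)


smooth = [9.0, 10.0, 10.0, 9.5]   # little variation: ratio 0.9
edge = [1.0, 10.0, 9.0, 8.0]      # sharp step, e.g. a cloud edge: ratio 0.1

print(min_max_ratio_sketch(smooth, edge))
print(min_max_ratio_sketch(smooth, edge, threshold=0.5))
```

The signal with the sharp step dominates the result, so the pair fails a threshold of 0.5 even though the other signal is smooth.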
- fit_checks.check_residuals_not_gaussian(first_signal, second_signal, threshold=None)
Check if the residuals of the linear fit are not from a normal distribution.
The function uses a Shapiro-Wilk test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the Shapiro-Wilk test on the residuals.
If a threshold is provided, returns True if the p value is below the specified threshold, i.e. if the residuals are probably not gaussian.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- threshold: float or None
Threshold for the Shapiro-Wilk p-value.
- Returns
- p_value: float or boolean
If threshold is None, then the function returns the p-value of the Shapiro-Wilk test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.
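The residual test can be sketched as below. The function name is hypothetical, the closed-form slope for a fit through the origin is an assumed way of fitting y = ax, and scipy.stats.shapiro is assumed as the Shapiro-Wilk implementation. The noise models are purely illustrative.

```python
import numpy as np
from scipy import stats


def residuals_not_gaussian_sketch(first_signal, second_signal, threshold=None):
    """Hypothetical stand-in: Shapiro-Wilk p-value of the residuals of y = a x."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)
    a = np.sum(x * y) / np.sum(x * x)  # least-squares slope with no intercept
    residuals = y - a * x
    _, p_value = stats.shapiro(residuals)
    if threshold is None:
        return float(p_value)
    # A small p-value means the residuals are probably not Gaussian.
    return bool(p_value < threshold)


rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 200)
y_gaussian = 2.0 * x + rng.normal(0.0, 0.1, x.size)
y_skewed = 2.0 * x + rng.exponential(1.0, x.size) - 1.0  # zero-mean, skewed noise

print(residuals_not_gaussian_sketch(x, y_gaussian))
print(residuals_not_gaussian_sketch(x, y_skewed, threshold=0.05))
```

Note the inverted sense compared with the other checks: True here flags a problem, namely residuals that are unlikely to be normally distributed.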
- fit_checks.check_residuals_not_gaussian_dagostino(first_signal, second_signal, threshold=None)
Check if the residuals of the linear fit are not from a normal distribution.
The function uses the D’Agostino-Pearson test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the D’Agostino-Pearson omnibus test on the residuals.
If a threshold is provided, returns True if the p value is below the specified threshold, i.e. if the residuals are probably not gaussian.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- threshold: float or None
Threshold for the D’Agostino-Pearson p-value.
- Returns
- p_value: float or boolean
If threshold is None, the function returns the p-value of the D’Agostino-Pearson test on the residuals. If a threshold is provided, the function returns True if the p-value is below the threshold.
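A sketch of the same idea with the omnibus test, assuming scipy.stats.normaltest (SciPy's D'Agostino-Pearson test) as the implementation. The function name is hypothetical and the code is not taken from fit_checks.

```python
import numpy as np
from scipy import stats


def residuals_not_gaussian_dagostino_sketch(first_signal, second_signal, threshold=None):
    """Hypothetical stand-in: omnibus normality p-value of the residuals of y = a x."""
    x = np.asarray(first_signal, dtype=float)
    y = np.asarray(second_signal, dtype=float)
    a = np.sum(x * y) / np.sum(x * x)  # least-squares slope with no intercept
    residuals = y - a * x
    # normaltest combines skewness and kurtosis; it needs a reasonably long sample.
    _, p_value = stats.normaltest(residuals)
    if threshold is None:
        return float(p_value)
    return bool(p_value < threshold)


rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 200)
y_skewed = 2.0 * x + rng.exponential(1.0, x.size) - 1.0  # zero-mean, skewed noise

print(residuals_not_gaussian_dagostino_sketch(x, y_skewed))
print(residuals_not_gaussian_dagostino_sketch(x, y_skewed, threshold=0.05))
```

Because the omnibus test relies on sample skewness and kurtosis, it is less suited to very short signals than the Shapiro-Wilk variant.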
- fit_checks.sliding_check_correlation(first_signal, second_signal, window_length=11, threshold=None)
Returns the sliding correlation coefficient between the two signals.
If a threshold is provided, returns True if the correlation is above the specified threshold.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- window_length: int
The length of the window. It should be an odd number.
- threshold: float or None
Threshold for the correlation coefficient.
- Returns
- correlation: float or boolean
If threshold is None, the function returns the correlation coefficient. If a threshold is provided, the function returns True if the correlation value is above the threshold.
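A sliding correlation can be sketched by building rolling slices of both signals and correlating them row by row. The function name is hypothetical, and the handling of the signal edges is an assumption: this sketch returns one value per complete window and does not pad, which may differ from what fit_checks does.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_correlation_sketch(first_signal, second_signal, window_length=11, threshold=None):
    """Hypothetical stand-in: Pearson correlation inside each complete window."""
    x = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    y = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)
    # Deviations from the per-window mean, one row per window.
    x_dev = x - x.mean(axis=1, keepdims=True)
    y_dev = y - y.mean(axis=1, keepdims=True)
    numerator = (x_dev * y_dev).sum(axis=1)
    denominator = np.sqrt((x_dev ** 2).sum(axis=1) * (y_dev ** 2).sum(axis=1))
    correlation = numerator / denominator
    if threshold is None:
        return correlation
    return correlation > threshold


x = np.linspace(1.0, 10.0, 50)
y = 2.0 * x
correlation = sliding_correlation_sketch(x, y)
print(correlation.shape)
print(sliding_correlation_sketch(x, y, threshold=0.9).all())
```

With 50 samples and a window of 11, the sketch yields 40 values, one for each window position.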
- fit_checks.sliding_check_linear_fit_intercept_and_correlation(first_signal, second_signal, window_length=11)
Check if the intercept of a linear fit is near zero.
Performs a linear fit to the data, assuming y = ax + b, with x the first_signal and y the second_signal.
It will return the value np.abs(b / np.mean(y) * 100) and the correlation of the two signals.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- window_length: int
The length of the window. It should be an odd number.
- Returns
- intercepts: float or boolean
The value of the intercept b, relative to the mean value of the second_signal.
- correlations: float
Correlation coefficient between the two samples
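The per-window fit has a closed form, so a sliding version can be sketched without an explicit loop. The function name is hypothetical, and returning values only for complete windows is an assumption. The relative intercept follows the expression np.abs(b / np.mean(y) * 100), applied within each window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_intercept_and_correlation_sketch(first_signal, second_signal, window_length=11):
    """Hypothetical stand-in: per-window relative intercept (percent) and correlation."""
    x = sliding_window_view(np.asarray(first_signal, dtype=float), window_length)
    y = sliding_window_view(np.asarray(second_signal, dtype=float), window_length)
    x_mean = x.mean(axis=1)
    y_mean = y.mean(axis=1)
    x_dev = x - x_mean[:, None]
    y_dev = y - y_mean[:, None]
    sxx = (x_dev ** 2).sum(axis=1)
    syy = (y_dev ** 2).sum(axis=1)
    sxy = (x_dev * y_dev).sum(axis=1)
    a = sxy / sxx                      # least-squares slope in each window
    b = y_mean - a * x_mean            # least-squares intercept in each window
    intercepts = np.abs(b / y_mean * 100)
    correlations = sxy / np.sqrt(sxx * syy)
    return intercepts, correlations


x = np.linspace(1.0, 10.0, 50)
y = 3.0 * x + 5.0
intercepts, correlations = sliding_intercept_and_correlation_sketch(x, y)
print(intercepts[:3])
print(correlations[:3])
```

For this example the intercept is 5 in every window, but its relative size shrinks along the signal because the window mean of y grows.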
- fit_checks.sliding_check_min_max_ratio(first_signal, second_signal, window_length=11, threshold=None)
Returns the sliding min/max ratio for both signals.
If a threshold is provided, returns True if the min/max ratio is above the specified threshold.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- window_length: int
The length of the window. It should be an odd number.
- threshold: float or None
Threshold for the min/max ratio.
- Returns
- minmax: float or boolean
If threshold is None, the function returns the min/max ratio. If a threshold is provided, the function returns True if the min/max ratio is above the threshold.
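A sliding min/max ratio can be sketched with the same rolling-slice approach. The function name is hypothetical, and both the unpadded edges and the choice of the smaller of the two per-signal ratios are assumptions rather than documented behaviour.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_min_max_ratio_sketch(first_signal, second_signal, window_length=11, threshold=None):
    """Hypothetical stand-in: smaller of the two min/max ratios in each window."""
    ratios = []
    for signal in (first_signal, second_signal):
        windows = sliding_window_view(np.asarray(signal, dtype=float), window_length)
        ratios.append(windows.min(axis=1) / windows.max(axis=1))
    ratio = np.minimum(ratios[0], ratios[1])
    if threshold is None:
        return ratio
    return ratio > threshold


flat = np.ones(30)
spiky = np.ones(30)
spiky[15] = 10.0  # a single sharp feature, e.g. a cloud edge

ratio = sliding_min_max_ratio_sketch(flat, spiky)
flags = sliding_min_max_ratio_sketch(flat, spiky, threshold=0.5)
print(ratio)
print(flags)
```

Every window that contains the spike drops to a ratio of 0.1, so a single sharp feature affects as many window positions as the window is long.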
- fit_checks.sliding_check_residuals_not_gaussian(first_signal, second_signal, window_length, threshold=None)
Check if the residuals of the linear fit are not from a normal distribution.
The function uses a Shapiro-Wilk test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the Shapiro-Wilk test on the residuals.
If a threshold is provided, returns True if the p value is below the specified threshold, i.e. if the residuals are probably not gaussian.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- window_length: int
The length of the window. It should be an odd number.
- threshold: float or None
Threshold for the Shapiro-Wilk p-value.
- Returns
- p_value: array
If threshold is None, then the function returns the p-value of the Shapiro-Wilk test on the residuals. If a threshold is provided, the function returns True if p-value is below the threshold.
- fit_checks.sliding_check_residuals_not_gaussian_dagostino(first_signal, second_signal, window_length, threshold=None)
Check if the residuals of the linear fit are not from a normal distribution.
The function uses the D’Agostino-Pearson test on the residuals of a linear fit. Specifically, the function performs a linear fit to the data, assuming y = ax, and then calculates the residuals r = y - ax. It will return the p value of the D’Agostino-Pearson omnibus test on the residuals.
If a threshold is provided, returns True if the p value is below the specified threshold, i.e. if the residuals are probably not gaussian.
- Parameters
- first_signal: array
The first signal array
- second_signal: array
The second signal array
- window_length: int
The length of the window. It should be an odd number.
- threshold: float or None
Threshold for the D’Agostino-Pearson p-value.
- Returns
- p_value: array
If threshold is None, the function returns the p-value of the D’Agostino-Pearson test on the residuals. If a threshold is provided, the function returns True if the p-value is below the threshold.