growthcurves.inference module
Utility functions for growth curve analysis.
This module provides utility functions for data validation, smoothing, derivative calculations, and RMSE computation.
- growthcurves.inference.compare_methods(t: ndarray, N: ndarray, model_family: str = 'all', phase_boundary_method: str = None, **fit_kwargs) → tuple
Fit multiple models and extract growth statistics for comparison.
This all-in-one convenience function fits all models in a specified family, extracts their statistics, and returns both for analysis and visualization.
- Parameters:
t (numpy.ndarray) – Time array for fitting
N (numpy.ndarray) – OD values for fitting
model_family (str, optional) – Which family of models to fit (default: “all”). Options:
- “mechanistic”: all mechanistic models (mech_logistic, mech_gompertz, etc.)
- “phenomenological”: all phenomenological models
- “all”: all available models
phase_boundary_method (str, optional) – Method for calculating phase boundaries (“threshold” or “tangent”). If None, uses default for each model type.
**fit_kwargs – Additional keyword arguments to pass to fitting functions (e.g., spline_s=100, window_points=15)
- Returns:
Tuple (fits, stats) where:
- fits: dictionary mapping method names to fit result dictionaries
- stats: dictionary mapping method names to statistics dictionaries
The stats dictionary can be passed directly to plot_growth_stats_comparison().
- Return type:
tuple
Examples
>>> # Fit and compare all mechanistic models
>>> fits, stats = gc.inference.compare_methods(t, N, model_family="mechanistic")
>>>
>>> # Plot comparison
>>> fig = gc.plot.plot_growth_stats_comparison(stats, title="Mechanistic Models")
>>> fig.show()
>>>
>>> # Fit phenomenological models with tangent phase boundaries
>>> fits, stats = gc.inference.compare_methods(
...     t, N,
...     model_family="phenomenological",
...     phase_boundary_method="tangent"
... )
- growthcurves.inference.compute_first_derivative(t, N)
Compute the first derivative of a growth curve.
- Parameters:
t (array_like) – Time array
N (array_like) – OD600 values (baseline-corrected)
- Returns:
Tuple of (t, dN) where dN is the first derivative dN/dt
- Return type:
tuple of (np.ndarray, np.ndarray)
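As a rough illustration of what such a derivative computation could look like, the sketch below uses NumPy's finite-difference gradient. The helper name first_derivative is hypothetical, and the use of np.gradient is an assumption; the library's actual implementation may smooth the data or use a different scheme.

```python
import numpy as np

def first_derivative(t, N):
    """Hypothetical helper: return (t, dN) with dN/dt from finite differences."""
    t = np.asarray(t, dtype=float)
    N = np.asarray(N, dtype=float)
    # np.gradient uses central differences in the interior and
    # one-sided differences at the two end points.
    dN = np.gradient(N, t)
    return t, dN

# Synthetic curve with a known derivative: d/dt (0.05 + 0.01 t^2) = 0.02 t
t = np.linspace(0.0, 10.0, 101)
N = 0.05 + 0.01 * t**2
t_out, dN = first_derivative(t, N)
```

On a uniform grid, the central difference is exact for a quadratic curve, so the interior values of dN match 0.02 * t here; real OD data is noisier and usually benefits from smoothing first.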
- growthcurves.inference.compute_instantaneous_mu(t, N)
Compute the instantaneous specific growth rate (μ = 1/N × dN/dt).
The specific growth rate μ represents the rate of population growth per unit of population. It is calculated as μ = (1/N) × d(N)/dt.
- Parameters:
t (array_like) – Time array
N (array_like) – OD600 values (baseline-corrected)
- Returns:
Tuple (t, mu) where mu is the specific growth rate μ = (1/N) × d(N)/dt
- Return type:
tuple of (np.ndarray, np.ndarray)
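The formula μ = (1/N) × dN/dt can be sketched directly with NumPy. The helper name instantaneous_mu is hypothetical and the finite-difference derivative is an assumption, not necessarily what the library does internally.

```python
import numpy as np

def instantaneous_mu(t, N):
    """Hypothetical helper: mu = (1/N) * dN/dt via finite differences."""
    t = np.asarray(t, dtype=float)
    N = np.asarray(N, dtype=float)
    dN = np.gradient(N, t)   # finite-difference estimate of dN/dt
    mu = dN / N              # per-capita (specific) growth rate
    return t, mu

# Pure exponential growth N = N0 * exp(r t) has a constant mu equal to r.
r = 0.5
t = np.linspace(0.0, 5.0, 51)
N = 0.01 * np.exp(r * t)
_, mu = instantaneous_mu(t, N)
```

For this synthetic exponential curve the interior values of mu are close to r = 0.5, with a small discretization error from the finite time step.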
- growthcurves.inference.compute_mu_max(t, N)
Calculate the maximum specific growth rate from a fitted curve.
The specific growth rate is μ = (1/N) * dN/dt = d(ln(N))/dt
- Parameters:
t – Time array
N – Fitted OD values
- Returns:
Maximum specific growth rate, in units of inverse time (1/t)
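The identity μ = d(ln(N))/dt suggests one simple way to estimate the maximum specific growth rate from a fitted curve. The sketch below is an assumption about the approach, with a hypothetical helper name mu_max_from_curve; it is tested on a logistic curve whose true maximum is r(1 − N0/K).

```python
import numpy as np

def mu_max_from_curve(t, N):
    """Hypothetical helper: maximum of d(ln N)/dt along a fitted curve."""
    t = np.asarray(t, dtype=float)
    N = np.asarray(N, dtype=float)
    mu = np.gradient(np.log(N), t)   # specific growth rate at each time point
    return float(np.nanmax(mu))

# Logistic curve: mu(t) = r * (1 - N/K), largest at t = 0 where N = N0.
K, N0, r = 1.0, 0.01, 0.8
t = np.linspace(0.0, 20.0, 201)
N = K / (1.0 + (K - N0) / N0 * np.exp(-r * t))
mu_max = mu_max_from_curve(t, N)
```

For these parameters the analytical maximum is 0.8 × 0.99 = 0.792, and the numerical estimate falls just below it because of the finite time step.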
- growthcurves.inference.compute_phase_boundaries(t, N, method='tangent', time_at_umax=None, od_at_umax=None, mu_max=None, baseline_od=None, plateau_od=None, lag_threshold=0.15, exp_threshold=0.15)
Calculate exponential phase boundaries using specified method.
This unified function allows choosing between different methods for determining the start and end of the exponential growth phase.
- Parameters:
t – Time array
N – OD values (should be from fitted/idealized curve)
method – Method to use for phase boundary calculation:
- “tangent”: Tangent line method (requires time_at_umax, od_at_umax, mu_max)
- “threshold”: Threshold-based method using fractions of μ_max
time_at_umax – Time at which μ_max occurs (required for “tangent” method)
od_at_umax – OD value at time_at_umax (required for “tangent” method)
mu_max – Maximum specific growth rate (required for “tangent” method)
baseline_od – Baseline OD for tangent method (defaults to min(N))
plateau_od – Plateau OD for tangent method (defaults to max(N))
lag_threshold – Fraction of μ_max for lag phase end (threshold method, default: 0.15)
exp_threshold – Fraction of μ_max for exp phase end (threshold method, default: 0.15)
- Returns:
Tuple of (exp_phase_start, exp_phase_end) times.
Examples
>>> # Tangent method (for non-parametric fits)
>>> exp_start, exp_end = compute_phase_boundaries(
...     t, N, method="tangent",
...     time_at_umax=5.0, od_at_umax=0.5, mu_max=0.8
... )
>>> # Threshold method (for parametric fits)
>>> exp_start, exp_end = compute_phase_boundaries(
...     t, N, method="threshold", lag_threshold=0.15, exp_threshold=0.15
... )
- growthcurves.inference.compute_phase_boundaries_mu_threshold(t, N, mu_max, lag_threshold=0.15, exp_threshold=0.15)
Calculate lag and exponential phase end times from specific growth rate.
- Parameters:
t – Time array
N – OD values (should be from fitted/idealized curve)
mu_max – Pre-calculated maximum specific growth rate (required)
lag_threshold – Fraction of μ_max for lag phase end detection
exp_threshold – Fraction of μ_max for exponential phase end detection
- Returns:
Tuple of (lag_end, exp_end) times.
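One plausible reading of the threshold method is sketched below: the lag phase ends when μ first reaches lag_threshold × μ_max, and the exponential phase ends when μ first drops below exp_threshold × μ_max after its peak. This crossing rule, the helper name, and the fallback values are assumptions; the source does not spell out the exact rule.

```python
import numpy as np

def phase_boundaries_mu_threshold(t, N, mu_max,
                                  lag_threshold=0.15, exp_threshold=0.15):
    """Hypothetical sketch of threshold-based phase boundaries."""
    t = np.asarray(t, dtype=float)
    N = np.asarray(N, dtype=float)
    mu = np.gradient(np.log(N), t)          # specific growth rate
    peak = int(np.argmax(mu))               # index where mu is largest

    # Lag end: first time (up to the peak) that mu reaches the lag threshold.
    rising = np.nonzero(mu[:peak + 1] >= lag_threshold * mu_max)[0]
    lag_end = t[rising[0]] if rising.size else np.nan

    # Exp end: first time (after the peak) that mu falls below the threshold.
    falling = np.nonzero(mu[peak:] < exp_threshold * mu_max)[0]
    exp_end = t[peak + falling[0]] if falling.size else t[-1]
    return float(lag_end), float(exp_end)

# Synthetic modified-Gompertz curve with a lag of 2 h and mu_max of 0.8 1/h.
A, mu_m, lam, N0 = 4.0, 0.8, 2.0, 0.01
t = np.linspace(0.0, 15.0, 301)
z = mu_m * np.e / A * (lam - t) + 1.0
N = N0 * np.exp(A * np.exp(-np.exp(z)))
lag_end, exp_end = phase_boundaries_mu_threshold(t, N, mu_max=mu_m)
```

On this curve μ peaks near t ≈ 3.8, so the two boundaries bracket the peak: lag_end lands shortly after t = 1 and exp_end close to t = 9.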
- growthcurves.inference.compute_phase_boundaries_tangent(t, N, time_at_umax, od_at_umax, mu_max, baseline_od=None, plateau_od=None)
Calculate exponential phase boundaries using tangent line method.
This method extends the tangent line at the point of maximum growth rate (μ_max) down to the baseline (lag phase) and up to the plateau (stationary phase). The intersection points define the start and end of exponential phase.
At the point of maximum growth rate, the tangent line in log space is:
ln(OD(t)) = ln(od_at_umax) + μ_max * (t - time_at_umax)
which in linear space gives:
OD(t) = od_at_umax * exp(μ_max * (t - time_at_umax))
- Parameters:
t – Time array
N – OD values (should be from fitted/idealized curve)
time_at_umax – Time at which μ_max occurs
od_at_umax – OD value at time_at_umax
mu_max – Maximum specific growth rate (μ_max)
baseline_od – Baseline OD (lag phase level). If None, uses min(N)
plateau_od – Plateau OD (stationary phase level). If None, uses max(N)
- Returns:
Tuple of (exp_phase_start, exp_phase_end) times.
Note
This method is more appropriate for non-parametric fits where the exponential phase is well-defined by the tangent at μ_max.
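Solving the tangent equation ln(OD(t)) = ln(od_at_umax) + μ_max * (t - time_at_umax) for the times at which OD equals the baseline and the plateau gives closed-form boundaries. The sketch below only works through that algebra; the helper name is hypothetical and it requires explicit baseline and plateau values rather than defaulting to min(N) and max(N).

```python
import math

def tangent_phase_boundaries(time_at_umax, od_at_umax, mu_max,
                             baseline_od, plateau_od):
    """Hypothetical helper: intersect the log-space tangent with two OD levels."""
    # Setting OD(t) = baseline_od in the tangent equation and solving for t.
    exp_start = time_at_umax + math.log(baseline_od / od_at_umax) / mu_max
    # Setting OD(t) = plateau_od and solving for t.
    exp_end = time_at_umax + math.log(plateau_od / od_at_umax) / mu_max
    return exp_start, exp_end

exp_start, exp_end = tangent_phase_boundaries(
    time_at_umax=5.0, od_at_umax=0.5, mu_max=0.8,
    baseline_od=0.05, plateau_od=1.0,
)
```

With these illustrative inputs the tangent reaches the baseline about 2.88 time units before time_at_umax and the plateau about 0.87 time units after it.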
- growthcurves.inference.compute_rmse(y_observed, y_predicted, in_log_space=False)
Calculate root mean square error between observed and predicted values.
- Parameters:
y_observed – Observed values
y_predicted – Predicted values
in_log_space – If True, compute RMSE in log space (default: False)
- Returns:
RMSE value (float), or np.nan if no valid data points
Note
Parametric models use in_log_space=False (linear space)
Non-parametric models (spline, sliding window) use in_log_space=False when data is already log-transformed
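A minimal sketch of an RMSE with an optional log-space mode is shown below. The helper name rmse and the way invalid points are masked (non-finite values always, non-positive values only in log space) are assumptions made for illustration.

```python
import numpy as np

def rmse(y_observed, y_predicted, in_log_space=False):
    """Hypothetical helper: RMSE in linear or log space, NaN if no valid points."""
    y_obs = np.asarray(y_observed, dtype=float)
    y_pred = np.asarray(y_predicted, dtype=float)
    valid = np.isfinite(y_obs) & np.isfinite(y_pred)
    if in_log_space:
        # Logarithms are only defined for strictly positive values.
        valid &= (y_obs > 0) & (y_pred > 0)
    if not valid.any():
        return float("nan")
    if in_log_space:
        residuals = np.log(y_obs[valid]) - np.log(y_pred[valid])
    else:
        residuals = y_obs[valid] - y_pred[valid]
    return float(np.sqrt(np.mean(residuals**2)))

linear_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
log_rmse = rmse([1.0, np.e], [1.0, np.e**2], in_log_space=True)
empty_rmse = rmse([np.nan], [1.0])
```

The linear case has a single residual of 2 over three points, and the log case a single residual of 1 over two points, which makes both results easy to check by hand.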
- growthcurves.inference.compute_sliding_window_growth_rate(t, N, window_points=15)
Compute instantaneous specific growth rate using a sliding window approach.
For each time point, fits a linear regression to log(N) vs t in a window centered at that point. The slope of the regression is the instantaneous specific growth rate μ at that time.
This method is more robust to noise than direct differentiation but requires more data points.
- Parameters:
t (array_like) – Time array
N (array_like) – OD600 values (baseline-corrected, must be positive)
window_points (int, optional) – Number of points in each sliding window (default: 15)
- Returns:
Tuple of (time_out, mu_out) where mu_out is the sliding window growth rate. Returns arrays with NaN for points where window fitting failed.
- Return type:
tuple of (np.ndarray, np.ndarray)
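The sliding-window idea can be sketched as a loop of local linear fits on log(N). The helper name sliding_window_mu is hypothetical, and leaving NaN wherever a full centered window does not fit is an assumption about edge handling.

```python
import numpy as np

def sliding_window_mu(t, N, window_points=15):
    """Hypothetical helper: slope of log(N) vs t in a centered window."""
    t = np.asarray(t, dtype=float)
    log_N = np.log(np.asarray(N, dtype=float))
    half = window_points // 2          # points on each side of the center
    mu = np.full(t.shape, np.nan)      # NaN where no full window is available
    for i in range(half, len(t) - half):
        window = slice(i - half, i + half + 1)
        # Degree-1 least-squares fit; the slope is the local growth rate.
        slope, _intercept = np.polyfit(t[window], log_N[window], 1)
        mu[i] = slope
    return t, mu

# Pure exponential growth has a constant growth rate r at every window.
r = 0.5
t = np.linspace(0.0, 10.0, 101)
N = 0.02 * np.exp(r * t)
t_out, mu = sliding_window_mu(t, N, window_points=15)
```

Note that with this indexing an even window_points yields windows of window_points + 1 samples; an odd value such as the default 15 gives exactly that many.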
- growthcurves.inference.detect_no_growth(t, N, growth_stats=None, min_data_points=5, min_signal_to_noise=5.0, min_od_increase=0.05, min_growth_rate=1e-06)
Detect whether a growth curve shows no significant growth.
Performs multiple checks to determine if a well should be marked as “no growth”:
1. All OD values are <= 0
2. Insufficient data points
3. Low signal-to-noise ratio (max/min OD ratio)
4. Insufficient OD increase (flat curve)
5. Zero or near-zero growth rate (from fitted stats)
- Parameters:
t – Time array
N – OD values (baseline-corrected)
growth_stats – Optional dict of fitted growth statistics (from extract_stats or sliding_window_fit). If provided, growth rate is checked.
min_data_points – Minimum number of valid data points required (default: 5)
min_signal_to_noise – Minimum ratio of max/min OD values (default: 5.0)
min_od_increase – Minimum absolute OD increase required (default: 0.05)
min_growth_rate – Minimum specific growth rate to be considered growth (default: 1e-6)
- Returns:
Dict with:
- is_no_growth: bool, True if no growth detected
- reason: str, description of why it was flagged (or “growth detected”)
- checks: dict with individual check results
- Return type:
dict
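The five checks listed above could be combined roughly as follows. This is an illustrative sketch only: the function name, the check names, the "mu_max" key read from growth_stats, and the use of max − min as the OD increase are all assumptions, not the library's confirmed behavior.

```python
import numpy as np

def detect_no_growth_sketch(t, N, growth_stats=None, min_data_points=5,
                            min_signal_to_noise=5.0, min_od_increase=0.05,
                            min_growth_rate=1e-6):
    """Hypothetical sketch; t is accepted only to mirror the documented signature."""
    N = np.asarray(N, dtype=float)
    valid = N[np.isfinite(N)]
    positive = valid[valid > 0]
    checks = {
        "has_positive_od": bool(positive.size > 0),
        "enough_points": bool(valid.size >= min_data_points),
        "signal_to_noise": bool(
            positive.size > 0
            and positive.max() / positive.min() >= min_signal_to_noise),
        "od_increase": bool(
            valid.size > 0
            and valid.max() - valid.min() >= min_od_increase),
    }
    if growth_stats is not None:
        # "mu_max" is an assumed key name for the fitted growth rate.
        rate = growth_stats.get("mu_max", 0.0)
        checks["growth_rate"] = bool(rate >= min_growth_rate)
    failed = [name for name, passed in checks.items() if not passed]
    reason = "failed: " + ", ".join(failed) if failed else "growth detected"
    return {"is_no_growth": bool(failed), "reason": reason, "checks": checks}

t = np.linspace(0.0, 10.0, 21)
flat = detect_no_growth_sketch(t, np.full(21, 0.1))
growing = detect_no_growth_sketch(t, 0.01 * np.exp(0.5 * t), {"mu_max": 0.5})
```

Here the flat curve fails the signal-to-noise and OD-increase checks, while the exponential curve passes every check.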
- growthcurves.inference.extract_stats(fit_result, t, N, lag_threshold=0.15, exp_threshold=0.15, phase_boundary_method=None, **kwargs)
Extract growth statistics from parametric or non-parametric fit results.
This function acts as a dispatcher that reads the model type from the fit_result and calls the appropriate model-specific extraction function.
- Parameters:
fit_result – Dict from fit_* functions (contains ‘params’ and ‘model_type’)
t – Time array (hours) used for fitting
N – OD values used for fitting
lag_threshold – Fraction of μ_max for lag phase detection (threshold method)
exp_threshold – Fraction of μ_max for exponential phase end (threshold method)
phase_boundary_method – Method for calculating phase boundaries:
- “threshold”: Threshold-based method using fractions of μ_max
- “tangent”: Tangent line method at point of maximum growth rate
- Returns:
Growth statistics dictionary.
- growthcurves.inference.is_no_growth(growth_stats)
Simple check if growth stats indicate no growth (failed or missing fit).
This is a convenience function for quick checks on growth_stats dicts. For more comprehensive checks including raw data analysis, use detect_no_growth().
- Parameters:
growth_stats – Dict from extract_stats or sliding_window_fit
- Returns:
True if no growth detected (empty stats or zero growth rate)
- Return type:
bool
- growthcurves.inference.smooth(N, window=11, poly=1, passes=2)
Apply Savitzky-Golay smoothing filter.
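A Savitzky-Golay filter replaces each sample with the value of a low-order polynomial fitted by least squares to a window around it. The NumPy-only sketch below illustrates that idea; the helper name, the edge handling (reusing the nearest full window), and the reading of passes as repeated application are assumptions, since the parameters are not documented here.

```python
import numpy as np

def savgol_smooth(N, window=11, poly=1, passes=2):
    """Hypothetical sketch of Savitzky-Golay smoothing via local polynomial fits."""
    y = np.asarray(N, dtype=float).copy()
    n = y.size
    if n < window:
        return y                       # too few points to fill one window
    half = window // 2
    x = np.arange(n, dtype=float)      # assumes evenly spaced samples
    for _ in range(passes):
        out = np.empty(n)
        for i in range(n):
            # Center the window on i, shifting it inward near the edges.
            start = min(max(i - half, 0), n - window)
            idx = slice(start, start + window)
            coeffs = np.polyfit(x[idx], y[idx], poly)
            out[i] = np.polyval(coeffs, x[i])
        y = out
    return y

rng = np.random.default_rng(0)
line = 0.1 + 0.02 * np.arange(30)
noisy = line + rng.normal(0.0, 0.01, size=line.size)
smoothed_line = savgol_smooth(line)
smoothed_noisy = savgol_smooth(noisy)
```

A straight line passes through a first-order filter unchanged, while the noisy copy comes out with much smaller point-to-point jumps.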
- growthcurves.inference.validate_data(t, N, min_points=10)
Validate and clean input growth curve data.
This function filters out invalid data points that would cause problems in growth curve analysis, particularly when taking logarithms for exponential growth calculations.
Filters out:
- Non-finite values (NaN, inf, -inf) in t or N
- Non-positive OD values (N <= 0), which are invalid for log transformations
- Datasets with insufficient data points or no time variation
- Parameters:
t – Time array
N – OD measurement array
min_points – Minimum number of valid data points required (default: 10)
- Returns:
Tuple of (time_clean, data_clean) with only valid data points, or (None, None) if the data does not meet the minimum requirements.
Examples
>>> t = np.array([0, 1, 2, 3, 4])
>>> N = np.array([0.0, 0.1, 0.2, np.nan, 0.4])  # Has zero and NaN
>>> time_clean, data_clean = validate_data(t, N, min_points=3)
>>> print(time_clean)
[1 2 4]
>>> print(data_clean)
[0.1 0.2 0.4]
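The filtering rules described for validate_data can be approximated with a boolean mask. The sketch below is an assumption about how those rules combine, using a hypothetical function name; it also converts both arrays to float, which the library may not do.

```python
import numpy as np

def validate_data_sketch(t, N, min_points=10):
    """Hypothetical sketch: keep finite, positive-OD points or return (None, None)."""
    t = np.asarray(t, dtype=float)
    N = np.asarray(N, dtype=float)
    # Drop non-finite values and non-positive OD (invalid for log transforms).
    keep = np.isfinite(t) & np.isfinite(N) & (N > 0)
    t_clean, N_clean = t[keep], N[keep]
    too_few = t_clean.size == 0 or t_clean.size < min_points
    # np.ptp is max - min; zero means there is no variation in time.
    if too_few or np.ptp(t_clean) == 0:
        return None, None
    return t_clean, N_clean

t = np.array([0, 1, 2, 3, 4])
N = np.array([0.0, 0.1, 0.2, np.nan, 0.4])   # has a zero and a NaN
t_clean, N_clean = validate_data_sketch(t, N, min_points=3)
rejected = validate_data_sketch(t, N)         # default min_points=10
```

Only three points survive the filtering here, so the call with the default min_points=10 is rejected while the call with min_points=3 returns the cleaned arrays.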