
Data Drift for Condition Based Monitoring

In our previous blog series on data analysis and ML development for sensor data, we discussed data preprocessing and how vibration data can be effectively used for condition monitoring (see Preprocessing Vibration Data for Machine Learning and Detecting Anomalies in Belt-Driven Systems with Machine Learning).

In this post, we focus on how changes in acceleration distribution can reveal shifts in machine behavior. To explore this, we use the same small test rig from the previous blogs to simulate different operating conditions. For transparency, we collect all data using our own enDAQ data logger, which provides high‑quality, high‑sample‑rate acceleration measurements.

We then use this data to track data drift over time and detect changes in machine condition. This approach can help identify early signs of misalignment, wear, looseness, or other abnormal situations — ultimately preventing progressive damage, major failures, and costly downtime.

 This blog includes: 

  1. Introduction to Data Drift
  2. Test Design and Data Acquisition
  3. Acceleration Visualization
  4. Acceleration Distribution
  5. KDE and Probability Distribution Distances
  6. Condition Monitoring
  7. Summary and Conclusion
  8. Resources, Scripts, and Files


Introduction to Data Drift 

In industrial condition monitoring, we often assume that a machine operating under “normal” conditions will continue to produce similar data over time. In reality, this assumption rarely holds. Machines age, loads change, components wear, and operating environments evolve. As a result, the statistical properties of the data collected from sensors, such as vibration, temperature, or pressure, gradually or abruptly change. This phenomenon is known as data drift. 

Data drift refers to a change in the underlying distribution of measured data over time. Even when a machine appears to be functioning normally, its sensor signals may slowly shift due to mechanical, structural, or operational changes. These shifts are not necessarily faults, but they often indicate that the system is moving away from its original baseline state. 

In the context of condition monitoring, data drift is not a problem; it is a signal. Detecting and quantifying data drift allows engineers to track changes in machine behavior long before traditional alarm thresholds are crossed. 

  1. Why Data Drift Matters in Condition Monitoring 

Traditional condition monitoring approaches typically rely on fixed thresholds applied to scalar features such as RMS, peak amplitude, or kurtosis. While effective for detecting severe faults, these methods are often insensitive to gradual changes. Data drift, on the other hand, captures subtle distributional changes in sensor signals that may correspond to early-stage degradation, loss of stiffness, misalignment, or changes in load transmission. 

By monitoring data drift: 

  • Early degradation can be detected before catastrophic failure 
  • Machine behavior can be tracked continuously, rather than classified as simply “healthy” or “faulty” 
  • Condition monitoring systems can adapt to changes in operating conditions 

In essence, data drift provides a continuous measure of deviation from normal behavior, rather than a binary decision. 
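To make this concrete, here is a minimal sketch on synthetic data (all numbers are illustrative and not from the test rig) showing how a fixed RMS alarm can stay silent while a distribution-based drift score grows as the signal slowly changes:

```python
# Synthetic illustration: gradual drift that a fixed RMS threshold misses.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 20000)   # "healthy" amplitude samples

rms_alarm = 1.5                          # fixed threshold on RMS
for scale in (1.0, 0.9, 0.8, 0.7):       # signal slowly narrowing over time
    current = rng.normal(0.0, scale, 20000)
    rms = np.sqrt(np.mean(current**2))
    drift = wasserstein_distance(baseline, current)
    print(f"scale={scale:.1f}  RMS={rms:.2f}  alarm={rms > rms_alarm}  drift={drift:.3f}")
```

The RMS never crosses the alarm level here, while the drift score keeps increasing as the distribution narrows, which is exactly the gradual change a binary threshold misses.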

  2. Data Drift vs. Faults 

It is important to distinguish data drift from faults. A fault is typically associated with a discrete mechanical failure, such as a cracked gear tooth or a damaged bearing. Data drift, however, describes a gradual or progressive change in data distributions that may or may not lead to a fault. 

For example: 

  • Increasing belt slack may cause vibration distributions to widen 
  • Structural loosening may shift dominant vibration amplitudes 
  • Load redistribution may change how energy is transmitted through the machine 

These effects manifest as data drift well before they become faults. 

  3. A Distribution-Based View of Machine Health 

Instead of monitoring individual signal values, data-drift-based condition monitoring focuses on entire signal distributions. By comparing the probability distributions of sensor data collected at different times, engineers can quantify how much the machine behavior has changed relative to a reference state. 

Metrics such as: 

  • Jensen–Shannon divergence, 
  • Hellinger distance, and 
  • Wasserstein (Earth Mover’s) distance 

provide mathematically rigorous ways to measure these changes. When applied to vibration signals, these metrics offer a powerful framework for detecting early degradation in a way that is robust to noise, sensor orientation, and machine-specific characteristics. 

In the sections that follow, we will explore how data drift can be quantified using distribution-based metrics, how these metrics relate to physical changes in machines, and how they can be applied to real-world condition monitoring problems without relying on machine-specific thresholds. 


Test Design and Data Acquisition

A representative test rig was designed, consisting of two pulleys: a drive pulley and a driven pulley. The driven pulley is adjustable and can be moved toward the drive pulley to intentionally loosen the belt, while the drive pulley remains fixed. The test rig is shown in Figure 1. 

Figure 1. The test rig, with an adjustable driven pulley and a fixed drive pulley. 

In this blog, a sampling rate of 4,000 Hz was used. The data logger was mounted on the rigid frame of the test rig, with the Z direction normal to the ground, as shown in Figure 2. For each test, the rig was first powered on, after which the data logger was activated to record acceleration data for 60 seconds. Once data acquisition was complete, the data logger was switched off, followed by the shutdown of the test rig. 


Figure 2. Data logger mounted on the rigid base. 

The vibration data were collected under three operating conditions: healthy, where the belt tension is normal (Figure 3a); relatively loose, where the belt tension was reduced to approximately 75% of normal (Figure 3b); and too loose, where the belt tension was significantly reduced, to 50% of normal (Figure 3c). The relatively loose condition is more representative of real-world operation and is therefore the primary focus for evaluating whether data drift can be detected under realistic fault conditions. The too loose condition is less likely to occur in practice but is included to illustrate how data drift manifests under severe belt loosening. 

 

 

(a) 

(b) 

(c) 

Figure 3. Collecting vibration data for (a) healthy, (b) 75% loose, and (c) 50% loose conditions. 


Acceleration Visualization 

Visualizing raw acceleration time series can sometimes provide insightful indications of the system's condition. In this section, the acceleration in the X, Y, and Z directions is plotted, as shown in Figure 4, using the code below: 

import pandas as pd 
import plotly.graph_objects as go 
import endaq 
from plotly.subplots import make_subplots 
import plotly.express as px 

# Each label maps to a list of file paths 
datasets = { 
    "Healthy": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000603_Healthy.ide', 
        # Add more Healthy files if needed 
    ], 
    "75% Loose": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000606_75Loose.ide' 
    ], 
    "50% Loose": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000604_50Loose.ide' 
    ], 
} 

# Column names to plot 
columns_plot = ["X (40g)", "Y (40g)", "Z (40g)"] 

# Assign consistent colors to each class label 
color_palette = px.colors.qualitative.Set1 
label_list = list(datasets.keys()) 
label_colors = {label: color_palette[i % len(color_palette)] for i, label in enumerate(label_list)} 

# Plotting function 
def plot_axes_combined(datasets, columns_plot, label_colors): 
    fig = make_subplots( 
        rows=3, 
        cols=1, 
        shared_xaxes=True, 
        vertical_spacing=0.05, 
    ) 

    # For each label and list of file paths 
    for label, file_paths in datasets.items(): 
        for idx, path in enumerate(file_paths): 
            try: 
                # Load and preprocess 
                doc = endaq.ide.get_doc(path) 
                accel = endaq.ide.to_pandas(doc.channels[80], time_mode='seconds') 
                accel = accel - accel.median()  # Remove DC offset 

                for i, axis in enumerate(columns_plot): 
                    if axis in accel.columns: 
                        fig.add_trace( 
                            go.Scatter( 
                                x=accel.index, 
                                y=accel[axis], 
                                mode='lines', 
                                name=label if idx == 0 and i == 0 else None,  # show legend only once per label 
                                legendgroup=label, 
                                line=dict(color=label_colors[label]), 
                                showlegend=(idx == 0 and i == 0) 
                            ), 
                            row=i + 1, col=1 
                        ) 
                    else: 
                        print(f"[{label}] Column '{axis}' not found in {path}") 

            except Exception as e: 
                print(f"Error loading file {path}: {e}") 

    # Add axis labels 
    for i, axis in enumerate(columns_plot): 
        fig.update_yaxes(title_text=axis, row=i + 1, col=1) 

    fig.update_xaxes(title_text="Time (seconds)", row=3, col=1) 

    fig.update_layout( 
        height=400, 
        width=1100, 
        title_text="Acceleration Signals from All Datasets (X, Y, Z)", 
        showlegend=True, 
        legend=dict( 
            orientation="h",       # Horizontal legend 
            yanchor="bottom", 
            y=1.02,                # Slightly above the top of the plot 
            xanchor="center", 
            x=0.5,                 # Centered horizontally 
            traceorder="grouped", 
            title_text='',         # Remove legend title 
            itemsizing='constant', 
            valign='top' 
        ), 
        legend_tracegroupgap=0,    # No space between grouped traces 
        margin=dict(t=100)         # Extra top margin to fit the legend 
    ) 

    fig.show() 

# Run the plot function 
plot_axes_combined(datasets, columns_plot, label_colors) 



Figure 4. Acceleration timeseries for X, Y, and Z directions. 

The healthy condition shows the widest spread, indicating higher vibration energy. As the belt gets looser, the vibration amplitude visibly decreases across all axes, so the amplitude distributions are expected to become narrower as well. The vibration distributions and KDEs are presented and discussed in the next sections. 


Acceleration Distribution  

General data preprocessing steps are described in detail in the blog post “Preprocessing Vibration Data for Machine Learning” (link). In this section, we focus on visualizing and analyzing the statistical distributions of acceleration signals measured along the X, Y, and Z axes. 

Rather than examining raw time-domain signals, we analyze their distributions, which provide a compact statistical description of how acceleration values are spread over time. Distribution-based analysis is particularly useful for detecting subtle changes in machine behavior that may not be apparent in individual time traces. 

A histogram is a graphical representation of the distribution of a dataset. It shows how frequently different ranges (or bins) of values occur. In the context of vibration analysis, a histogram describes how often specific acceleration amplitudes appear during operation. 

Changes in a dynamic system condition can alter the shape, width, or symmetry of these histograms, even when traditional metrics such as RMS remain relatively unchanged. 

The script below extracts acceleration data in the X, Y, and Z directions and plots their corresponding histograms (Figure 5). In addition to the individual axes, a resultant acceleration is computed to provide a direction-independent representation of vibration energy:

Resultant = √(X² + Y² + Z²)

By including the resultant acceleration, we reduce sensitivity to sensor orientation and mounting direction, which is particularly important when comparing measurements across different machines or sensor locations. 

import pandas as pd 
import numpy as np 
import plotly.graph_objects as go 
from plotly.subplots import make_subplots 
import endaq 

# ===================================================== 
# Dataset dictionary 
# ===================================================== 
datasets = { 
    "Healthy": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000603_Healthy.ide', 
        # Add more Healthy files if needed 
    ], 
    "75% Loose": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000606_75Loose.ide' 
    ], 
    "50% Loose": [ 
        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000604_50Loose.ide' 
    ], 
} 

# ===================================================== 
# Parameters 
# ===================================================== 
columns = ["X (40g)", "Y (40g)", "Z (40g)", "Resultant"] 

requested_start = 5 
requested_end = 55 

# Colors 
label_colors = { 
    "Healthy": "#d62728", 
    "75% Loose": "#1f77b4", 
    "50% Loose": "#2ca02c", 
} 

# ===================================================== 
# Step 1: Find common overlapping time window 
# ===================================================== 
time_ranges = {} 

for label, file_list in datasets.items(): 
    for file_path in file_list: 
        doc = endaq.ide.get_doc(file_path) 
        accel = endaq.ide.to_pandas(doc.channels[80], time_mode='seconds') 
        time_ranges[file_path] = (accel.index.min(), accel.index.max()) 

max_start = max(t[0] for t in time_ranges.values()) 
min_end = min(t[1] for t in time_ranges.values()) 

common_start = max(max_start, requested_start) 
common_end = min(min_end, requested_end) 

if common_start >= common_end: 
    raise ValueError("❌ No common time window across all datasets.") 

print(f"✅ Using common time window: {common_start:.2f}s → {common_end:.2f}s") 

# ===================================================== 
# Step 2: Create subplots 
# ===================================================== 
fig = make_subplots( 
    rows=1, 
    cols=len(columns), 
    subplot_titles=columns, 
    shared_yaxes=True 
) 

# ===================================================== 
# Step 3: Load, trim, and plot histograms 
# ===================================================== 
for label, file_list in datasets.items(): 
    color = label_colors[label] 
    legend_shown = False 

    for file_path in file_list: 
        try: 
            doc = endaq.ide.get_doc(file_path) 
            accel = endaq.ide.to_pandas(doc.channels[80], time_mode='seconds') 

            # Detrend (median removal) 
            accel = accel - accel.median() 

            # Enforce common time window 
            accel = accel[ 
                (accel.index >= common_start) & 
                (accel.index <= common_end) 
            ] 

            # Compute resultant 
            accel["Resultant"] = np.sqrt( 
                accel["X (40g)"]**2 + 
                accel["Y (40g)"]**2 + 
                accel["Z (40g)"]**2 
            ) 

            # Plot histograms 
            for j, col in enumerate(columns): 
                fig.add_trace( 
                    go.Histogram( 
                        x=accel[col], 
                        nbinsx=50, 
                        opacity=0.5, 
                        marker=dict(color=color), 
                        name=label if not legend_shown else None, 
                        legendgroup=label, 
                        showlegend=not legend_shown, 
                        histnorm="probability density"  # comparable PDFs 
                    ), 
                    row=1, 
                    col=j + 1 
                ) 
                legend_shown = True 

        except Exception as e: 
            print(f"[{label}] Failed to process {file_path}: {e}") 

# ===================================================== 
# Step 4: Layout 
# ===================================================== 
fig.update_layout( 
    height=500, 
    width=1100, 
    title_text=( 
        "Overlayed Histograms of Acceleration " 
        f"(Common {common_start:.1f}–{common_end:.1f}s)" 
    ), 
    barmode="overlay" 
) 

fig.update_xaxes(title_text="Acceleration (g)") 
fig.update_yaxes(title_text="Probability Density") 

fig.show() 

# ===================================================== 
# Step 5: Save figure 
# ===================================================== 
# Replace the path below with the path of your folder (r"...\histograms_common_window.html") 
output_path = r"C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\histograms_common_window.html" 

fig.write_html(output_path) 

print(f"✅ Figure saved to: {output_path}") 
   


Figure 5. Histograms of acceleration for the X, Y, and Z directions, as well as the resultant. 

Comparing acceleration distributions across different machine conditions allows us to observe how vibration behavior evolves as the system condition changes. These distributional changes form the basis for data-drift metrics such as Jensen–Shannon divergence, Hellinger distance, and Wasserstein distance, which are discussed in the following sections. 

Based on the histograms in Figure 5, reducing the belt tension causes the distributions in the X, Y, and Z directions to become narrower. In the resultant plot, the distribution not only narrows but also shifts to the left. 
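The earlier claim that the resultant acceleration reduces sensitivity to sensor orientation can be verified with a quick synthetic check (illustrative data, not from the rig): rotating the sensor axes changes the individual X, Y, and Z components, but leaves the resultant magnitude unchanged.

```python
# Synthetic check: the resultant magnitude is rotation-invariant.
import numpy as np

rng = np.random.default_rng(3)
xyz = rng.normal(size=(1000, 3))          # stand-in X/Y/Z acceleration samples

theta = np.deg2rad(30)                    # rotate the sensor 30° about the Z axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
rotated = xyz @ R.T

# Individual components change, but the resultant magnitude does not
print(np.allclose(np.linalg.norm(xyz, axis=1),
                  np.linalg.norm(rotated, axis=1)))   # True
```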


KDE and Probability Distribution Distances

Kernel Density Estimation (KDE) 

Kernel Density Estimation (KDE) is a non-parametric method used to estimate the underlying probability density function of a random variable from finite data samples. Unlike histograms, KDE provides a smooth and continuous representation of the distribution without requiring arbitrary binning choices. In vibration analysis, KDE enables comparison of signal amplitude distributions across operating conditions by preserving subtle shape differences. These estimated probability density functions form the basis for computing distribution-based distance and divergence metrics such as Jensen–Shannon divergence, Hellinger distance, and Wasserstein distance. 
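As a minimal illustration (synthetic samples standing in for acceleration data), scipy's `gaussian_kde` fits a smooth density that can then be evaluated on a fixed amplitude grid:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 0.5, 5000)   # stand-in for acceleration samples (g)

kde = gaussian_kde(samples)            # bandwidth chosen automatically (Scott's rule)
x = np.linspace(samples.min(), samples.max(), 1000)
pdf = kde(x)                           # smooth, continuous density estimate

print(np.trapz(pdf, x))                # integrates to ~1 over the grid
```

Evaluating several conditions' KDEs on the same grid, as done below, is what makes the resulting density curves directly comparable.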

Jensen–Shannon Divergence (JSD) 

The Jensen–Shannon divergence is an information-theoretic measure that quantifies the similarity between two probability distributions. Unlike the Kullback–Leibler divergence, JSD is symmetric and always finite, making it well suited for practical condition monitoring applications. Given two distributions P and Q, JSD is defined as: 

JSD(P ‖ Q) = ½ KL(P ‖ M) + ½ KL(Q ‖ M),   where M = ½ (P + Q) 

and KL denotes the Kullback–Leibler divergence. JSD increases as the vibration behavior deviates from the reference condition. 
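This definition translates directly into numpy/scipy (toy distributions for illustration; note that `scipy.spatial.distance.jensenshannon` returns the square root of this quantity):

```python
import numpy as np
from scipy.stats import entropy   # entropy(p, q) computes KL(p || q)

def jsd(p, q):
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    m = 0.5 * (p + q)
    return 0.5 * entropy(p, m) + 0.5 * entropy(q, m)

p = np.array([0.1, 0.4, 0.4, 0.1])   # reference distribution
q = np.array([0.2, 0.3, 0.3, 0.2])   # drifted distribution

print(jsd(p, p))   # 0.0 for identical distributions
print(jsd(p, q))   # positive; bounded by ln(2) with natural logs
```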

Hellinger Distance 

The Hellinger distance is a statistical metric that measures the geometric separation between two probability distributions. It is bounded between 0 and 1, which makes it easy to interpret and compare across different systems. For discrete probability distributions P and Q, the Hellinger distance is defined as: 

H(P, Q) = (1/√2) · √( Σᵢ (√pᵢ − √qᵢ)² ) 

The Hellinger distance is particularly robust to noise and small probability values, making it suitable for vibration-based condition monitoring. 
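A small numpy sketch of this formula on toy distributions shows its bounded behavior:

```python
import numpy as np

def hellinger(p, q):
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2)

print(hellinger([0.1, 0.4, 0.4, 0.1], [0.1, 0.4, 0.4, 0.1]))  # 0.0 (identical)
print(hellinger([0.1, 0.4, 0.4, 0.1], [0.2, 0.3, 0.3, 0.2]))  # between 0 and 1
print(hellinger([1, 0], [0, 1]))                              # 1.0 (disjoint supports)
```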

Wasserstein Distance 

The Wasserstein distance, also known as the Earth Mover's Distance, measures the minimum “cost” required to transform one probability distribution into another. Unlike divergence-based metrics, it accounts for the actual distance between values on the amplitude axis. In one dimension, it can be expressed as: 

W(P, Q) = ∫ |F_P(x) − F_Q(x)| dx 

where F_P and F_Q are the cumulative distribution functions of P and Q. This metric is particularly sensitive to shifts in vibration amplitude caused by changes such as belt looseness. 
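scipy exposes this metric as `scipy.stats.wasserstein_distance`, which accepts raw samples or weighted grids. A sketch on synthetic samples (illustrative values only) shows its sensitivity to both shifts and narrowing:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 10000)   # stand-in baseline samples

print(wasserstein_distance(a, a))          # 0.0: identical samples
print(wasserstein_distance(a, a + 0.3))    # ~0.3: a pure shift moves all mass by 0.3
print(wasserstein_distance(a, 0.5 * a))    # narrowing also registers as drift
```

The middle case is what makes the metric physically intuitive: for a pure amplitude shift, the distance equals the shift itself, in the same units as the signal.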

The table below compares these metrics: 

| Metric | Interpretability | Sensitivity to Change | Robustness | Notes for Condition Monitoring |
| --- | --- | --- | --- | --- |
| Jensen–Shannon Divergence (JSD) | Medium | High | High | Detects subtle distribution shape changes; bounded and symmetric; well suited for tracking gradual drift |
| Hellinger Distance | High | Medium–High | Very High | Bounded [0, 1]; less sensitive to noise and outliers; stable across different machines |
| Wasserstein Distance | High | Very High | Medium | Sensitive to amplitude shifts; physically intuitive; may be influenced by operating load changes |

Practical Interpretation 

  • JSD is effective when small but meaningful changes occur in the vibration distribution shape. 
  • Hellinger provides the most stable behavior across sensors, machines, and operating conditions. 
  • Wasserstein is the most physically intuitive metric but can be affected by global amplitude scaling. 
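These contrasts can be reproduced on synthetic distributions (an illustrative sketch; the grid and parameters are arbitrary): a small mean shift versus a small narrowing of a baseline density, scored by all three metrics.

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance

x = np.linspace(-5, 5, 2001)   # amplitude grid

def normal_pdf(mu, sigma):
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return p / p.sum()

def jsd(p, q):
    m = 0.5 * (p + q)
    return 0.5 * entropy(p, m) + 0.5 * entropy(q, m)

def hellinger(p, q):
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2)

ref = normal_pdf(0.0, 1.0)   # baseline condition
for name, pdf in [("shift +0.2", normal_pdf(0.2, 1.0)),
                  ("narrow 20%", normal_pdf(0.0, 0.8))]:
    w = wasserstein_distance(x, x, u_weights=ref, v_weights=pdf)
    print(f"{name}: JSD={jsd(ref, pdf):.4f} "
          f"Hellinger={hellinger(ref, pdf):.4f} Wasserstein={w:.4f}")
```

For the pure shift, the Wasserstein value comes out equal to the shift itself (about 0.2 g on this grid), while JSD and Hellinger respond to the change in shape regardless of its physical size.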

Condition Monitoring 

In this section, the probability distribution distance metrics introduced in the previous section are computed using the collected vibration data for three operating conditions created on the test rig: healthy, relatively loose, and too loose. The resulting values are then analyzed to evaluate how they can be used for condition monitoring. The script below computes distribution-based distance metrics for each operating condition, presents the results in tabular form, and visualizes the corresponding kernel density estimation (KDE) curves for the X, Y, and Z acceleration components, as well as the resultant acceleration. 

import pandas as pd 

import numpy as np 

import plotly.graph_objects as go 

from plotly.subplots import make_subplots 

from scipy.stats import gaussian_kde, entropy 

from scipy.stats import gaussian_kde, entropy, wasserstein_distance 

import endaq 

 

# ===================================================== 

# Dataset dictionary 

# ===================================================== 

datasets = { 

    "100": [ 

        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000603_Healthy.ide', 

        # Add more Healthy files if needed 

    ], 

     

    "75": [ 

        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000606_75Loose.ide' 

    ], 

     

    "50": [ 

        r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000604_50Loose.ide' 

    ], 

     

    # "Detension": [ 

    #     r'C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift\IDE\DAQ16837_000605_detension.ide' 

    # ], 

 

} 

 

# ===================================================== 

# Parameters 

# ===================================================== 

columns = ["X (40g)",  "Y (40g)", "Z (40g)"]    #"X (40g)", "Y (40g)", "Z (40g)" 

start_time = 5 

end_time = 55 

resultant = True 

trim_start = 0.05 

trim_end = 0.05 

 

# ===================================================== 

# Helper: Jensen–Shannon Divergence 

# ===================================================== 

def jensen_shannon_divergence(p, q): 

    p = np.array(p) / np.sum(p) 

    q = np.array(q) / np.sum(q) 

    m = 0.5 * (p + q) 

    return 0.5 * (entropy(p, m) + entropy(q, m)) 

 

# ===================================================== 

# Step 1: Load all data and find common overlap 

# ===================================================== 

accels_raw = {} 

time_ranges = {} 

 

for label, files in datasets.items(): 

    for file in files: 

        doc = endaq.ide.get_doc(file) 

        accel = endaq.ide.to_pandas(doc.channels[80], time_mode='seconds') 

        accel = accel - accel.median() 

        accels_raw[label] = accel 

        time_ranges[label] = (accel.index.min(), accel.index.max()) 

        print(f"[{label}] Time range: {accel.index.min():.2f}s to {accel.index.max():.2f}s " 

              f"({accel.index.max() - accel.index.min():.2f}s)") 

 

# Find the maximum common overlap 

max_start = max(r[0] for r in time_ranges.values()) 

min_end = min(r[1] for r in time_ranges.values()) 

 

# Enforce the requested start/end times if possible 

common_start = max(max_start, start_time) 

common_end = min(min_end, end_time) 

print(f"\n✅ Using common time window: {common_start:.2f}s to {common_end:.2f}s") 

 

# Define a shared time grid 

N = 10000 

common_time = np.linspace(common_start, common_end, N) 

 

# ===================================================== 

# Step 2: Compute KDEs on the same time grid 

# ===================================================== 

kdes = {} 

x_ranges = {} 

all_axes = columns.copy() 

if resultant: 

    all_axes.append("Resultant") 

 

for axis in all_axes: 

    kdes[axis] = {} 

    all_x = [] 

 

    for label, accel in accels_raw.items(): 

        # Interpolate to the common grid 

        accel_interp = accel.reindex(accel.index.union(common_time)).interpolate('index').loc[common_time] 

 

        # Compute resultant if needed 

        if resultant and axis == "Resultant": 

            if all(col in accel_interp.columns for col in columns): 

                accel_interp["Resultant"] = np.sqrt( 

                    accel_interp[columns[0]]**2 + 

                    accel_interp[columns[1]]**2 + 

                    accel_interp[columns[2]]**2 

                ) 

            else: 

                print(f"[{label}] Missing axes for resultant.") 

                continue 

 

        if axis not in accel_interp.columns: 

            continue 

 

        # Extract and trim 

        data = accel_interp[axis].dropna().values 

        n = len(data) 

        if n == 0: 

            continue 

        start_idx = int(trim_start * n) 

        end_idx = int(n * (1 - trim_end)) 

        data = data[start_idx:end_idx] 

 

        if len(data) < 10: 

            print(f"[{label}] Too little data after trimming for {axis}.") 

            continue 

 

        kde = gaussian_kde(data) 

        kdes[axis][label] = kde 

        all_x.extend(data) 

 

    if len(all_x) > 0: 

        x_ranges[axis] = np.linspace(min(all_x), max(all_x), 1000) 

    else: 

        x_ranges[axis] = np.linspace(-1, 1, 1000) 

 

 

 

# ===================================================== 

# Step 3: Compute JSD, max amplitude, JSD/MaxAmp, normalized JSD, 

# and other distances (KLD, Hellinger, Wasserstein) 

# ===================================================== 

all_results = {} 

 

results_rows = [] 

 

for axis in all_axes: 

    jsd_results = {} 

    kld_results = {} 

    hellinger_results = {} 

    wasserstein_results = {} 

    max_amp_dict = {} 

 

    # Reference KDE (largest label) only used for JSD calculation 

    ref_label = max(kdes[axis].keys(), key=int) 

    ref_kde = kdes[axis][ref_label] 

    x = x_ranges[axis] 

    ref_pdf = ref_kde(x) 

 

    # Compute distances and max amplitude 

    for label, kde in kdes[axis].items(): 

        pdf = kde(x) 

        # --- JSD --- 

        jsd_val = jensen_shannon_divergence(ref_pdf, pdf) 

        jsd_results[label] = jsd_val 

        # # --- KLD --- 

        p = np.where(ref_pdf == 0, 1e-12, ref_pdf) 

        q = np.where(pdf == 0, 1e-12, pdf) 

        # kld_results[label] = entropy(p, q) 

        # --- Hellinger --- 

        hellinger_results[label] = np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q))**2)) / np.sqrt(2) 

        # --- Wasserstein --- 

        wasserstein_results[label] = wasserstein_distance(x, x, u_weights=p, v_weights=q) 

 

        # --- Max amplitude --- 

        if axis == "Resultant": 

            if all(col in accels_raw[label].columns for col in columns): 

                accel_data = np.sqrt( 

                    accels_raw[label][columns[0]]**2 + 

                    accels_raw[label][columns[1]]**2 + 

                    accels_raw[label][columns[2]]**2 

                ) 

            else: 

                accel_data = pd.Series(dtype=float) 

        else: 

            accel_data = accels_raw[label].get(axis, pd.Series(dtype=float)) 

 

        n = len(accel_data) 

        if n > 0: 

            start_idx = int(trim_start * n) 

            end_idx = int(n * (1 - trim_end)) 

            data_trimmed = accel_data.iloc[start_idx:end_idx].values 

            max_amp = np.max(np.abs(data_trimmed)) if len(data_trimmed) > 0 else 0.0 

        else: 

            max_amp = 0.0 

        max_amp_dict[label] = max_amp 

 

    # Sort items by JSD ascending 

    sorted_items = sorted(jsd_results.items(), key=lambda x: x[1]) 

    labels_sorted = [item[0] for item in sorted_items] 

    jsd_sorted = [item[1] for item in sorted_items] 

 

    # Lowest and highest JSD for normalized JSD 

    lowest_jsd = jsd_sorted[0] 

    highest_jsd = jsd_sorted[-1] 

 

    # Compute ΔJSD vs lowest and normalized JSD 

    delta_jsd_from_lowest = [val - lowest_jsd for val in jsd_sorted] 

    normalized_jsd = [(val - lowest_jsd) / (highest_jsd - lowest_jsd) if highest_jsd != lowest_jsd else 0.0 for val in jsd_sorted] 

 

    # --- Print results --- 

    print(f"\n=== Axis: {axis} ===") 

    print(f"Reference for PDF: {ref_label}") 

    print(f"{'Label':>6} |  {'%ΔTension':>12} | {'JSD':>10} | {'Hellinger':>10} | {'WassDist':>10}")   # {'KLD':>10} | 

    print("-" * 70)  

 

    for i, label in enumerate(labels_sorted): 

        tension = float(label) 

        pct_tension = (tension - float(ref_label)) / float(ref_label) * 100 

        max_amp = max_amp_dict[label] 

        jsd_over_amp = jsd_sorted[i] / max_amp if max_amp != 0 else 0.0 

     

        print(f"{label:>6} | {pct_tension:12.2f}% | {jsd_sorted[i]:10.6f} | " 

              f"{hellinger_results[label]:10.6f} | {wasserstein_results[label]:10.6f}") 

     

        results_rows.append({ 

            "Axis": axis, 

            "Label": label, 

            "Tension_%": tension, 

            "JSD": jsd_sorted[i], 

            "Hellinger": hellinger_results[label], 

            "Wasserstein": wasserstein_results[label], 

        }) 

 

# ===================================================== 

# Step 4: Plot KDE Distributions 

# ===================================================== 

# Assign one consistent color per dataset 

import plotly.express as px 

 

labels = sorted(kdes[all_axes[0]].keys(), key=int) 

base_colors = px.colors.qualitative.Plotly 

color_map = {label: base_colors[i % len(base_colors)] for i, label in enumerate(labels)} 

 

kde_fig = make_subplots( 

    rows=1, 

    cols=len(all_axes), 

    subplot_titles=all_axes, 

    shared_yaxes=True 

) 

 

for i, axis in enumerate(all_axes): 

    x = x_ranges[axis] 

    for label, kde in sorted(kdes[axis].items(), key=lambda x: int(x[0])): 

        y = kde(x) 

        kde_fig.add_trace( 

            go.Scatter( 

                x=x, y=y, 

                mode='lines', 

                name=f"{label}", 

                line=dict(color=color_map[label]), 

                legendgroup=label, 

                showlegend=(i == 0)   # legend only in first subplot 

            ), 

            row=1, col=i + 1 

        ) 

 

 

kde_fig.update_layout( 

    height=500, 

    width=250 * len(all_axes), 

    title_text=f"KDE of Vibration Signals (Common {common_start:.1f}–{common_end:.1f}s) "
               f"(Trim {trim_start*100:.0f}% start, {trim_end*100:.0f}% end)"
)
kde_fig.update_xaxes(title_text="Acceleration (g)")
kde_fig.update_yaxes(title_text="Density")
kde_fig.show()

# =====================================================
# Step 5: Save figure
# =====================================================
kde_output_path = (
    r"C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift"
    r"\kde_vibration_common_window.html"
)
kde_fig.write_html(kde_output_path)
print(f"✅ KDE plot saved to: {kde_output_path}")

# =====================================================
# Step 6: Save results as one CSV table
# =====================================================
results_df = pd.DataFrame(results_rows)
csv_output_path = (
    r"C:\Users\hdabiri\OneDrive - Hutchinson Industries Inc\HAMED\Marketing\3.DataDrift"
    r"\kde_distance_metrics.csv"
)
results_df.to_csv(csv_output_path, index=False)
print(f"✅ Metrics table saved to: {csv_output_path}")


Figure 6. KDE of vibration signals for the X, Y, and Z directions, as well as the resultant. 

Based on the KDE curves shown in Figure 6, as the belt gets looser, the KDE distributions become narrower, whereas a tighter belt corresponds to wider distributions. This effect is most pronounced in the vertical vibration axis (Z). Another key observation from the KDEs is that, when the belt gets looser, the resultant acceleration distribution not only becomes narrower but also shifts to the left, indicating a decrease in acceleration amplitude when the belt is loose. 

Table I. JSD, Hellinger, and Wasserstein metrics for X, Y, Z and resultant accelerations. 

Axis      | Label | Tension_% | JSD      | Hellinger | Wasserstein
----------|-------|-----------|----------|-----------|------------
X (40g)   | 100   | 100       | 0        | 0         | 0
X (40g)   | 75    | 75        | 0.010055 | 7.648881  | 0.003601
X (40g)   | 50    | 50        | 0.035609 | 14.69657  | 0.006962
Y (40g)   | 100   | 100       | 0        | 0         | 0
Y (40g)   | 50    | 50        | 0.008285 | 5.856887  | 0.003937
Y (40g)   | 75    | 75        | 0.011525 | 6.960527  | 0.005344
Z (40g)   | 100   | 100       | 0        | 0         | 0
Z (40g)   | 75    | 75        | 0.052921 | 10.98605  | 0.023472
Z (40g)   | 50    | 50        | 0.111831 | 16.3282   | 0.034014
Resultant | 100   | 100       | 0        | 0         | 0
Resultant | 75    | 75        | 0.055266 | 14.30971  | 0.024039
Resultant | 50    | 50        | 0.130089 | 23.21788  | 0.033132

 

Based on JSD, Hellinger, and Wasserstein metrics for different belt tensions provided in Table I: 

X and Y directions: 

In the X direction (perpendicular to the belt length), all metrics increase gradually as the belt becomes looser, indicating moderate sensitivity to looseness. This suggests that reduced belt stiffness introduces additional vibration in this direction, but the overall distribution remains relatively close to the healthy state. In contrast, the Y direction (along the belt) exhibits the smallest divergence values across all metrics. Vibrations along the belt length are therefore weakly affected by belt tension, making this axis less informative for detecting belt looseness. 

Z direction: 

The Z direction (normal to the ground) shows a pronounced increase in all metrics with looseness, indicating strong sensitivity to belt dynamics and significant changes in vibration amplitude distribution. 

Resultant acceleration: 

The resultant signal exhibits the largest divergence values and a clear monotonic trend, demonstrating its effectiveness in aggregating multi-axis vibration changes into a robust fault indicator. 
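For readers who want to reproduce these metrics outside the full script, the three distances reported in Table I can be sketched with SciPy. This is a minimal, self-contained example on synthetic normal samples, which stand in for the measured acceleration data; the narrower distribution mimics the loose-belt case:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import gaussian_kde, wasserstein_distance

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 5000)   # stand-in for the healthy (tight-belt) signal
drifted  = rng.normal(0.0, 0.6, 5000)   # narrower distribution, as in the loose-belt case

# Evaluate both KDEs on a shared grid so the PDFs are directly comparable
x = np.linspace(-5, 5, 500)
p = gaussian_kde(baseline)(x)
q = gaussian_kde(drifted)(x)
p /= p.sum()
q /= q.sum()

jsd = jensenshannon(p, q, base=2) ** 2          # squared distance = JS divergence, in [0, 1]
hellinger = np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))
wass = wasserstein_distance(baseline, drifted)  # works directly on the raw samples

print(f"JSD={jsd:.4f}  Hellinger={hellinger:.4f}  Wasserstein={wass:.4f}")
```

Note that `jensenshannon` returns the Jensen–Shannon *distance* (the square root of the divergence), so it is squared here; the Hellinger distance is computed directly from its definition on the discretized PDFs, and `wasserstein_distance` operates on the raw samples rather than the KDE grid.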


Summary and conclusion 

Condition monitoring is essential for predictive maintenance, enabling early detection of degradation before unplanned downtime or failures occur. Vibration signals are particularly informative, as changes in stiffness, alignment, or load often appear as subtle shifts long before visible damage. 

Probability distribution distance metrics (Jensen–Shannon divergence, Hellinger distance, and Wasserstein distance) offer a powerful, distribution-based view of vibration behavior, capturing gradual degradation such as belt looseness. Key takeaways: 

  • KDE enables comparison of full vibration distributions rather than single features. 
  • JSD and Hellinger are ideal for stable trend monitoring and early fault detection. 
  • Wasserstein distance provides intuitive sensitivity to amplitude shifts. 
  • Z-axis (normal to the ground) and resultant signals are the most reliable indicators for belt condition monitoring. 
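The min–max normalization used in Step 3 of the script, which expresses drift on a 0–1 scale for trending and alerting, can be isolated as a small helper. The example values below are the resultant-axis JSDs from Table I:

```python
def normalize_drift(values):
    """Min-max normalize a series of drift metrics to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All metrics identical: no measurable spread in drift
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Resultant-axis JSD at 100%, 75%, and 50% belt tension (Table I)
jsd_trend = [0.0, 0.055266, 0.130089]
normalized = normalize_drift(jsd_trend)
print(normalized)
```

A simple alert rule can then flag any recording whose normalized drift exceeds a chosen threshold (e.g. 0.5) relative to the baseline window.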

It should be noted that KDE can be sensitive to factors such as sensor orientation and location. A robust data preprocessing pipeline is therefore critical for accurate and reliable condition monitoring. 


Resources, Scripts, and Files

 

 

Download Python Script


Hamed Dabiri

Hamed earned his PhD in 2022, focusing on ML applications in structural dynamics. He then completed a postdoctoral fellowship at Imperial College London, studying Material Strength and ML. With over 27 published papers, his research spans ML, structural dynamics, and material behavior. In 2024, Hamed became a Senior ML Engineer at Mide Technology, applying his...
