enDAQ Blog for Data Sensing and Analyzing

Get Started with Python: Why and How Mechanical Engineers Should Make the Switch

Written by Steve Hanly | Sep 1, 2021 5:34:35 PM

Any engineer will have heard of Python by this stage. But if you are a typical mechanical engineer, it's unlikely you and your team will be using Python for your data analysis.

This is starting to change (as evidenced by the fact that you are reading this blog!) as mechanical engineers abandon MATLAB for Python in growing numbers.

In this blog, I'll explain why this migration is happening and how to get started. I'll also provide an example set of code you can run right in your browser without installing anything. It loads some data (any dataset!), plots it, and then does some basic time domain and frequency domain analysis (probably covering the majority of what you're using MATLAB for anyway). There are a TON of online resources that provide more detail on how to get started, so I'll also point you to some good ones.

I'll cover:

  1. Why Learn Python?
    1. Popularity
    2. It's Free
    3. It Can Do More
  2. When Is Python Not the Best Choice?
  3. Which is faster, MATLAB or Python?
  4. Getting Started with Python
    1. What to Download & From Where
    2. Environment to Code (Spyder, PyCharm, Jupyter Lab, Google Colab)
  5. Interactive & Free Python Example to Do Basic Vibration Analysis
  6. Resources
  7. Conclusion & Poll

Why Learn Python

There are three major reasons to make the switch to Python:

  1. It is more popular (for good reason)
  2. It is free in cost and in terms of flexibility
  3. It can do more!

I'll also answer these two questions you typically get from MATLAB users who are hesitant to make the switch:

  1. Any exceptions where MATLAB is better?
  2. MATLAB is faster, right? Wrong!

Python is WAY More Popular

As of this writing, Python is over ten times more popular than MATLAB according to the TIOBE index, which ranks Python 2nd and MATLAB 17th on its list of the most popular programming languages. The rankings are determined by analyzing search engine data.

I'm not normally one to do something just because everyone else is doing it. I tend to like being a bit against the tide! But in the case of programming, adopting a more popular language has the major benefit of giving you access to way more examples and a larger community of people that can help you learn.

The site Stack Overflow offers a platform for developers to ask questions and collaborate on code. They also share some cool data on trends in the types of questions being asked. The following plot (source: Stack Overflow) shows the percentage of monthly questions that were asked about MATLAB compared to three core Python packages that are used for data analysis and plotting: Pandas, NumPy & Matplotlib (more on Python packages later).

Questions about MATLAB peaked in 2015, but since then the interest has been dropping while the adoption of the equivalent Python tools has been rising even faster. In 2021, MATLAB accounted for about 0.2% of the questions on Stack Overflow. Meanwhile, the core data analysis and plotting libraries in Python accounted for about 4% of all the questions. The engineering and data analysis community as a whole is adopting Python in droves; it's time mechanical engineers did too!

Python is open source, meaning anyone can contribute code to the community. As Python's popularity grows, more people contribute, and the capabilities and performance of the language grow along with it.

Python is Free

Python is free: it does not cost anything to begin developing with it. Meanwhile, MATLAB by MathWorks costs $2,150 per user per year, and this is just for the base license. They also sell a variety of Toolboxes that cost anywhere from an additional $1,000 per year up to over $10,000 per year for some packages.

Not only is Python free in terms of cost, but it is also free in terms of flexibility. Because it is open source and adopted by such a large audience, there are a number of different ways to use Python. This means there are many different user interfaces (some we'll show in a moment). You can use it on any operating system (Windows, macOS, Linux, etc.) without the need for a new license. And there are now over 300,000 freely available packages on PyPI that offer code and functionality you can use, and even modify, with complete freedom!

Python Can Do More

Python can do and excel at the data analytics we typically do in MATLAB. Yet it can do so much more! It is used to run major websites like Reddit, Netflix, Instagram, and even Google. It interfaces with cloud-based data much more easily and it has more readily available and capable machine learning libraries. MATLAB, meanwhile, will try to convince you to buy another toolbox. 

The above was a long way of saying that in general...

Anything that can be done in MATLAB can be replicated in Python with the right package. The reverse, however, is not true.
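To make that a bit more concrete, here is a minimal sketch of a few everyday MATLAB operations and one way to write them with NumPy. The MATLAB lines in the comments and the numbers are just illustrative choices of mine, and there is usually more than one way to do each of these in Python.

```python
import numpy as np

# MATLAB: t = linspace(0, 1, 5);
t = np.linspace(0.0, 1.0, 5)

# MATLAB: A = [2 1; 1 3];  b = [3; 5];  x = A \ b;
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# MATLAB: first = t(1);  last = t(end);   (MATLAB counts from 1, Python from 0)
first, last = t[0], t[-1]

# MATLAB: y = t .^ 2;   (NumPy arithmetic is element-wise by default)
y = t ** 2

# MATLAB: C = A * A;    (the matrix product uses @ in Python)
C = A @ A

print(x, first, last, y, C, sep="\n")
```

The two habits that trip up most MATLAB users are visible here: indexing starts at 0, and `*` on arrays is element-wise, so matrix products need `@`.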

 

Any Exceptions?

I included a caveat in that last statement because there are some edge cases where there aren't readily available and comparable packages in Python... yet!

These generally boil down to some of MATLAB's very particular toolboxes like their communications ones (Satellite Communications, LTE, and 5G) and some other oddities (although like most things in Python, it may just take some digging to find the right comparable package). MathWorks, the company behind MATLAB, also offers Simulink, a modeling and simulation environment built around block diagrams that has no clear alternative in Python.

But I'm willing to bet the vast majority of the mechanical engineers out there who use MATLAB don't rely on these toolboxes meaning they can easily switch to Python with no loss in performance! For general instrumentation and data analysis needs, anything that can be done in MATLAB (or even LabVIEW) can be replicated in Python (but remember the reverse isn't true).

But MATLAB is Faster Right? Wrong!

I started getting Python-curious back in 2016 for a number of reasons including the fact the team writing enDAQ software was doing it in Python (another reason was quantitative finance, which was fun!). But the mechanical engineering side of the company, including all the executives, were very much MATLAB users.

I heard from these old-school MATLAB lovers that MATLAB would be faster than Python because it uses compiled Fortran and C libraries and Python doesn't.

But like a lot of things in Python... that isn't the complete answer. You just need to use the right library.

NumPy, Pandas, and SciPy are the predominant libraries in Python for data analysis and they do actually use compiled C and Fortran code under the hood in much the same way that MATLAB does. 
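If you want to see the effect of that compiled code for yourself, here is a small sketch that computes the RMS of a signal twice: once with a plain Python loop and once with NumPy. The signal is random data I made up for the illustration, and the helper names `rms_loop` and `rms_numpy` are my own. The timings will depend on your machine, so treat them as a rough indication rather than a benchmark.

```python
import time
import numpy as np

# Made-up "acceleration" signal: 200,000 random samples
rng = np.random.default_rng(0)
accel = rng.standard_normal(200_000)

def rms_loop(values):
    """RMS computed one sample at a time in pure Python."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5

def rms_numpy(values):
    """RMS computed with vectorized NumPy calls (compiled code under the hood)."""
    return float(np.sqrt(np.mean(values ** 2)))

start = time.perf_counter()
slow = rms_loop(accel)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = rms_numpy(accel)
numpy_seconds = time.perf_counter() - start

print(f"loop:  {slow:.6f} in {loop_seconds:.4f} s")
print(f"NumPy: {fast:.6f} in {numpy_seconds:.4f} s")
```

Both functions return the same number; the difference is only in where the looping happens.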

So I set out to compare Python to MATLAB for loading data and doing some basic vibration analysis and wrote a blog published in August 2016: MATLAB vs Python: Speed Test for Vibration Analysis [Free Download]. At first, my lack of familiarity with Python led to my MATLAB code far outperforming the Python version. But then commenters (another sign of the power in community and popularity) helped me improve performance. I wasn't using the right libraries (I needed to use Pandas and FFTW, see the blog for more detail)! These improvements resulted in Python beating MATLAB for basic vibration analysis back in late 2016. Here is a table of the final results explained in that blog:

Now, just to tease how cool Python is and something made possible with the Plotly library, here's an interactive plot of this data generated in Python:

Long story short: if you have an example of some data analysis algorithm that runs faster in MATLAB compared to Python, I'm willing to bet there is a different way to implement that algorithm in Python that will indeed be faster than MATLAB.

For those particularly worried about Python's execution speed, there is a just-in-time compiler called Numba that translates Python code into native machine code. There are some impressive examples out there, like this one, which compares the speed of a simple algebraic calculation in Python (with and without Numba), MATLAB, and Fortran. Spoiler alert: the Python code with Numba ran 10x faster than MATLAB (Python and NumPy ran faster even without Numba) and even faster than Fortran.
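Here is a minimal sketch of what using Numba looks like, assuming it is installed (it falls back to plain Python if not). The `moving_peak` function and the test signal are my own illustration, not the benchmark mentioned above. The idea is that an explicit loop, which would normally be slow in Python, gets compiled the first time the function is called.

```python
import numpy as np

try:
    from numba import njit  # Numba's "no Python mode" JIT decorator
except ImportError:
    # Fallback so this sketch still runs where Numba is not installed
    def njit(func):
        return func

@njit
def moving_peak(values, window):
    """Largest absolute value in each non-overlapping window of samples."""
    n_windows = len(values) // window
    peaks = np.empty(n_windows)
    for i in range(n_windows):
        peak = 0.0
        for j in range(i * window, (i + 1) * window):
            a = abs(values[j])
            if a > peak:
                peak = a
        peaks[i] = peak
    return peaks

# Made-up signal: a sine wave with 10,000 samples, split into 10 windows
sig = np.sin(np.linspace(0.0, 20.0, 10_000))
peaks = moving_peak(sig, 1_000)
print(peaks)
```

Note that the first call includes the compile time, so any speed comparison should time a second call.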

Getting Started

This is one area where at first I thought MATLAB had the advantage. Getting started is easy with MATLAB because there's really only one choice of where to get it and the environment to code in. Python is different though and there are a lot of choices here! I'll suggest a few options I personally like and think will be best suited to folks familiar with MATLAB.

What to Download & From Where

There are four options here in order of "completeness."

  1. Install Python Directly
  2. Miniconda
  3. Anaconda
  4. Google Colab

Install Python Directly

To download Python you can go directly to the source at Python.org. The trouble here though is that it won't come with any packages or coding environments. So this will probably not be the best option for someone used to MATLAB; it will feel too primitive!

Miniconda

A slightly more complete option is to install Miniconda, but admittedly it is still going to feel pretty primitive to the MATLAB user: it lacks a graphical user interface initially, and you have to go and download the IDE you want to use.

Anaconda

Downloading the Anaconda distribution, individual edition, is by far the recommended method for the new Python user, especially someone coming from MATLAB. This will come complete with over 600 common packages and a few different environments for writing and executing code, which we'll talk about next.
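Whichever route you pick, a quick sanity check is to ask Python which of the common data analysis packages it can find. This is a small sketch using only the standard library; the package list is my own choice for the kind of analysis discussed in this post, and `check_packages` is a helper name I made up.

```python
import importlib.util
import sys

# Packages commonly used for the analysis in this post (my own selection)
PACKAGES = ["numpy", "pandas", "scipy", "matplotlib", "plotly"]

def check_packages(names):
    """Return {package name: True/False} for whether each one can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

print("Python", sys.version.split()[0])
for name, found in check_packages(PACKAGES).items():
    print(f"{name:12s} {'installed' if found else 'missing'}")
```

Anything reported as missing can be added later with your distribution's package manager.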

There is one caveat here: a company actually manages this distribution and they updated their terms of service in mid-2020 to disallow "heavy commercial users" from downloading the free version and to instead purchase a commercial edition that starts at $15/month. But the reader of this article is unlikely to meet their definition of a "heavy commercial user" because you will not be mirroring their repository regularly.

But if your organization is leery of open source because you want to have trust that a company is helping manage this for you and will continue to support it, it may be a good idea to consider their commercial licenses. These are still much lower cost than MATLAB and offer some advantages of sharing and collaborating on code with colleagues.

Google Colab

The last option I'll suggest for getting started is to just visit Google Colaboratory (Colab for short). This is the lowest lift for getting started in Python. You simply open it in a browser and you're up and running with writing and executing Python right away, no downloads or installations required! In my previous examples, I mentioned that you need to go install all these Python libraries to get started, but Colab comes ready with a bunch of pre-installed libraries (you can install others if needed, too, of course).

Think of Colab as Google Docs for Python. Colab notebooks are based on Jupyter Notebooks (more on that later), which provide a very rich environment for explaining your research, writing code, and then generating interactive outputs like plots. In the Google environment, this also makes it very easy to share and collaborate (as the name implies) with others on a data analysis and research project. Here's a quick 3 minute video introducing Colab:

Google Colab is totally free but they offer paid versions starting at $10/month that provide more computing power and longer runtimes. The only downside of Colab is that it can't easily access locally saved information on your computer or network. So if you are going to be doing a lot of analysis on data that is somewhat private, Colab may not be for you. But it is still a great way to get started!
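That said, you can still bring individual files into a Colab session by hand. Here is a hedged sketch using the `files.upload()` helper that ships with Colab; the wrapper function `upload_csv_in_colab` is my own, it only works inside a Colab notebook, and keep in mind that the file is sent to Google's servers when you do this.

```python
import io
import pandas as pd

def upload_csv_in_colab():
    """Open Colab's file picker and return the first uploaded CSV as a DataFrame."""
    try:
        from google.colab import files  # only importable inside Google Colab
    except ImportError as err:
        raise RuntimeError("This helper only works inside Google Colab") from err
    uploaded = files.upload()  # dict of {file name: file contents as bytes}
    name, content = next(iter(uploaded.items()))
    print(f"Loaded {name}")
    return pd.read_csv(io.BytesIO(content))

# In a Colab cell you would then run:
# df = upload_csv_in_colab()
```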

Environment to Code

In Python not only are there a near-infinite number of libraries to use, but you have countless ways of writing and executing code. Here I'll provide a few examples I'd recommend you be aware of as a beginner.

Spyder

Spyder comes with the Anaconda distribution which will be a plus (no additional download required) and it has a very similar interface to MATLAB so it will be a logical place to start if you like and are used to that environment. It will include:

  • Variable explorer (very nice!)
  • Editor for writing code
  • Console (similar to command window in MATLAB)
  • Integrated help
  • History
  • And more!

Jupyter Lab

Jupyter Notebooks allow you to create and share documents that contain live code, equations, visualizations, and rich narrative text in markdown. And this is all done in a web browser! 

Data analysis tends to be iterative and exploratory and will ideally result in a report that you share with your stakeholders. In a Jupyter Notebook, shown below, this is all done in one place which can be exported as a PDF or interactive HTML file. Notebooks are so good for data analysis that MATLAB copied the idea with their "Live Editor."

Jupyter Lab is a follow-on from the original Jupyter Notebook, which I fell in love with the first time I used it. In Jupyter Lab you have a much more modular user interface with different sub-windows for multiple code editors, outputs, variable explorers, etc. It combines the best things from Spyder (and what you'd expect in MATLAB) with what you love about Jupyter Notebooks.

Google Colab

I already mentioned Google Colaboratory when providing ways to get started installing Python because Colab is a completely self-contained system. But it is so good (and relevant) it needs to be included in this section, too, on environments to code in! 

Colab notebooks are basically Google-hosted Jupyter notebooks that make it super easy to collaborate with others.

PyCharm

I initially didn't include PyCharm in the introduction to Python but one of my software engineers convinced me to. This is the environment that "proper" developers will use because it is meant for creating and managing "projects" which would include many different .py files for building a complete software program. So, to support this, PyCharm offers nice refactoring, debugging, testing, and other tools. It also includes code completion and assistance which can significantly speed up writing code (oftentimes you find yourself typing the same line in multiple places!).

I think for the Python beginner this is going to be too much, but it's good to know it is available!

Basic Interactive & Free Example

In this Google Colab notebook you can upload your own data and analyze it for free without downloading anything! We'll go through an example that does the following:

  1. Load CSV (assuming time vs acceleration)
  2. Basic analysis in the time domain
  3. Plot full time series with matplotlib
  4. Plot moving peak with Plotly
  5. Plot moving RMS with Plotly
  6. Plot time history around peak with Plotly
  7. Compute & plot PSD
  8. Cumulative RMS from the PSD
  9. Find and plot moving peak frequency

We have a follow-on notebook that dives deeper into NumPy and Pandas for data analysis. It also previews a bit of enDAQ's library for analyzing shock & vibration data. Here's the notebook!

Resources

Remember Python is enormously popular so there is a LOT of content available online to help you learn. There are also a few particularly good tutorials for folks familiar with MATLAB:

Also, here's a webinar we ran on Python for mechanical engineers. The beginning gives a similar overview to this blog, but at about 10 minutes in we start a live demonstration of using Python for vibration analysis using Colab.  


Last but not least, think of us at enDAQ as a resource to tap into, specifically if you are doing some analysis on shock & vibration data. We are also in the process of finishing an open-source library to simplify some of the data analysis that we know the customers of our sensors, and the mechanical engineering community at large, do regularly in MATLAB but that can be done better in Python!

Conclusion

I hope this post illustrated the value of switching to Python. As you can see there are tons of resources available to help get you started programming! As always, feel free to contact us if you have any questions or you can leave a comment below. And don't forget we're going to be running a webinar on this topic on September 14th at 12:00 (Eastern Time). Join us to see a live demonstration of using Google Colab to analyze data.  

And, if you have a few seconds, fill out our poll on Python and MATLAB. We always want to make sure our content is meeting our readers' needs, so your feedback and input are invaluable!
