Microsoft’s recent announcement that Excel will now natively support Python has the potential to significantly advance data analysis capabilities for millions of advanced users of spreadsheets.
This integration of Python, one of the world’s most popular and powerful programming languages, with Excel, one of the most widely used business software applications, unlocks new possibilities for working with and deriving insights from data.
In this post, I look at what bringing Python to Excel means, who stands to benefit, and how it could affect the future of data-driven business decisions.
What Is Python and Why Does It Matter for Data Analysis?
Python is an open-source, interpreted programming language that was first released in 1991 by its creator, Guido van Rossum.
Over the past three decades, Python has become one of the most widely used languages among programmers and data scientists. This is largely thanks to its simplicity, readability, vibrant community, and especially the vast collection of specialized libraries and frameworks for data analysis, machine learning, and more.
Some of the most popular and important Python libraries used for data analytics include:
- Pandas: Provides easy-to-use data structures and data manipulation/analysis tools that make handling structured data simple and intuitive. Pandas can clean, reshape, subset, and transform data sets.
- NumPy: Adds support for fast, multidimensional arrays and matrices, plus mathematical and statistical operations for working with numerical data.
- Matplotlib: Flexible plotting and charting library for data visualization and exploratory analysis.
- Scikit-Learn: Leading library for machine learning tasks like classification, regression, and clustering.
- SQLAlchemy: Allows Python to communicate with databases like SQL Server, PostgreSQL, MySQL, etc.
By connecting directly to these Python libraries from within Excel, users gain access to state-of-the-art tools for practically any data wrangling, analysis, or visualization challenge.
This represents a massive upgrade over Excel’s inherent capabilities.
How Python in Excel Works
Previously, the main way to program Excel beyond its built-in functions was Visual Basic for Applications (VBA), Microsoft’s long-standing scripting language for Office apps. VBA is dated, limited in its data analysis tooling, and often disliked by programmers.
With the new integration, users can access Python’s capabilities in a few ways:
- Import Python libraries and write Python code directly in cells using the new =PY function
- Define Python functions and variables in one Python cell and reuse them in Python cells that are calculated later
- Reference worksheet cells, ranges, and tables from Python code with the xl() function
Python code can then work directly with the spreadsheet data. For example, the data from a sheet can be loaded into a Pandas DataFrame and analyzed using the full Pandas API.
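As a rough sketch of that idea, the snippet below builds a DataFrame from rows shaped like a small worksheet range and then uses ordinary Pandas operations on it. The column names and values are made up for illustration. Inside an Excel Python cell, the range would normally be pulled in with the xl() function; since xl() is not available outside Excel, a plain list of rows stands in for it here.

```python
import pandas as pd

# Hypothetical stand-in for a worksheet range A1:C4 (first row = headers).
# Inside Excel, this data would come from something like:
#     df = xl("A1:C4", headers=True)
rows = [
    ["Region", "Units", "Price"],
    ["North", 10, 2.5],
    ["South", 4, 3.0],
    ["West", 7, 1.5],
]

# First row becomes the column names, the rest become the data.
df = pd.DataFrame(rows[1:], columns=rows[0])

# Once the data is a DataFrame, the usual Pandas API applies.
df["Revenue"] = df["Units"] * df["Price"]
total_revenue = df["Revenue"].sum()
```

Here total_revenue comes out to 47.5 for the sample rows.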
To ensure stability and security, Python code does not execute locally. Instead, it runs in isolated containers on Microsoft’s Azure cloud platform. This provides safety, but it also means that any workbook data the Python code touches is sent to Microsoft’s servers, which may be a concern for sensitive data.
An option to run Python locally would address that privacy concern.
Unlock the Power of Python in Excel with a New Update
As non-programmers, we tested Python in Excel first-hand. Based on that experience, here are some of the most valuable ways regular users can get more out of the Python integration:
Seamlessly Switch Between Excel Formulas and Python Code
With Python support, you can easily switch between using Excel formulas and writing Python code right within your workbooks.
Python code can be inserted using the new Python section in the Formulas tab or by simply typing “=PY” to create a Python formula cell.
When in Python mode, the formula bar changes to allow you to write Python scripts. Pressing Enter will add a new line to make it easier to write longer code. To run the code, use Ctrl + Enter instead of just Enter.
Create Python Objects from Excel Ranges
One powerful feature is the ability to convert Excel ranges into Pandas DataFrames in Python. Simply reference a cell range and Excel will automatically pass the data to Python and create a DataFrame.
You can have the cell show the DataFrame as a Python object or return its contents to the grid as Excel values. Either way, the DataFrame gives you access to the extensive data analysis functionality of Pandas and other Python libraries on your Excel data.
Name Python Objects for Easier Referencing
To simplify referencing a DataFrame, you can assign it a name like “df”. Then you can just use df instead of re-referencing the cell range.
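A minimal sketch of what naming buys you: once the data is bound to a name, later steps refer to that name instead of the range. The product data here is hypothetical, and in Excel the first assignment would typically use xl() on the range rather than a hand-built DataFrame.

```python
import pandas as pd

# Hypothetical data standing in for a referenced cell range.
# In Excel this line might instead read: df = xl("A1:B4", headers=True)
df = pd.DataFrame({"Product": ["A", "B", "C"], "Sales": [120, 80, 100]})

# Later steps use the name `df` rather than re-referencing the range.
top_product = df.sort_values("Sales", ascending=False).iloc[0]["Product"]
average_sales = df["Sales"].mean()
```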
Use Python to Easily Aggregate and Analyze Data
With Python integration, you can use functions like .groupby() and .sum() to easily group and aggregate your Excel data without formulas or pivot tables.
For example, you can group sales data by date and sum the values to get totals per date. Because the Python results are formula-driven, changing the source data updates them when the workbook recalculates.
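The grouping described above could look roughly like this. The column names (Date, Amount) and the figures are assumptions for the sake of the example; a real sheet would supply its own.

```python
import pandas as pd

# Hypothetical sales rows; in Excel these would come from a cell range.
sales = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"],
    "Amount": [100, 50, 75, 25, 60],
})
sales["Date"] = pd.to_datetime(sales["Date"])

# Group the rows by date and sum the amounts within each group.
totals_per_date = sales.groupby("Date")["Amount"].sum()
```

The result is a Series indexed by date, with totals of 150, 100, and 60 for the three sample days.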
Visualize Data Directly in Cells with Python
You can use Python’s visualization libraries like Matplotlib to generate charts and graphs directly within cells.
For example, create a line chart of sales per month with just a few lines of code. Adjust the chart type, frequency, and other attributes simply by changing the Python script.
Since it is a Python formula, updating the source data will automatically refresh the chart.
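As an illustration, a monthly sales line chart might be sketched like this. The data and column names are invented, and the Agg backend line is only there so the snippet also runs outside Excel, where no display is attached.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sales rows; in Excel these would come from a cell range.
sales = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-11", "2024-03-02", "2024-03-28"]
    ),
    "Amount": [100, 50, 75, 25, 60],
})

# Total sales per calendar month.
monthly = sales.groupby(sales["Date"].dt.to_period("M"))["Amount"].sum()

# Draw one line with a marker for each month.
fig, ax = plt.subplots()
ax.plot(list(monthly.index.astype(str)), list(monthly.values), marker="o")
ax.set_title("Sales per month")
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
```

Swapping ax.plot for ax.bar changes the chart type, and changing "M" to "Q" in to_period switches the grouping from months to quarters.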
Import Python Libraries to Enhance Functionality
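While Pandas and Matplotlib are available by default, you can import other libraries to access more functionality. The choice is limited to the Python standard library and the packages bundled with the Anaconda distribution that Python in Excel uses; arbitrary packages cannot be installed.
For example, import Python’s built-in re module for regular expressions to extract URLs from text strings in a table. The formulas will dynamically pull URLs from any new rows added.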
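Here is one way the URL extraction could be sketched. The Text column and its contents are hypothetical, and the pattern is deliberately simple: it matches http:// or https:// followed by any run of non-space characters, so it is not a full URL validator.

```python
import re

import pandas as pd

# Hypothetical table of free-text comments; in Excel this would be a cell range.
comments = pd.DataFrame({
    "Text": [
        "See https://example.com/docs for details",
        "No link in this row",
        "Mirror at http://example.org/files and more text",
    ]
})

# Simple pattern: scheme, then everything up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def first_url(text):
    """Return the first URL found in text, or an empty string if there is none."""
    match = URL_PATTERN.search(text)
    return match.group(0) if match else ""


comments["URL"] = comments["Text"].apply(first_url)
```

Rows without a link end up with an empty string in the URL column rather than an error.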
Connect to Power Query Data Sources
Python formulas can connect to Power Query data sources in your workbook without needing to first load the data. This allows Python to access larger datasets while keeping your workbook lightweight.
Who Stands to Benefit from Python in Excel?
There are a few major groups who will find Excel’s Python support valuable:
Casual Excel Users
Excel is used by hundreds of millions of people worldwide, most with no programming experience. For these casual users, the ability to leverage Python’s capabilities will unlock new ways of working with data.
Tasks like combining data sets, cleaning data, advanced calculations, and creating insightful visualizations can now be achieved through Python code generated by assistants like ChatGPT.
This provides access to sophisticated data science techniques without needing to learn to code.
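To make that concrete, here is the kind of short script an assistant might produce for combining and cleaning two small tables. The table names, columns, and values are invented for illustration.

```python
import pandas as pd

# Two hypothetical tables that would normally live on separate sheets.
orders = pd.DataFrame({
    "CustomerID": [1, 2, 2, 3],
    "Amount": [100.0, None, 40.0, 25.0],
})
customers = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Name": [" alice ", "Bob", "CAROL"],
})

# Clean: drop orders with a missing amount and tidy up the names.
orders_clean = orders.dropna(subset=["Amount"])
customers["Name"] = customers["Name"].str.strip().str.title()

# Combine: attach the customer name to each remaining order.
combined = orders_clean.merge(customers, on="CustomerID", how="left")
```

The merged table keeps the three orders that have an amount and shows the names as Alice, Bob, and Carol.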
Experienced Data Analysts
Data analysts who are already familiar with Python will appreciate being able to leverage their existing skills directly within Excel.
Switching between Excel and a separate Python environment is no longer required, and Python’s capacity for automating repetitive tasks can be applied to Excel workflows. Analyzing, modeling, and visualizing data can all happen in the same workbook.
Developers
For developers, the ability to define Python functions inside a workbook allows for extending Excel’s default functionality with advanced logic tailored to specific use cases.
Building reusable tools and running models can now be done natively in Excel using mainstream Python packages. Because the code runs in an isolated cloud environment without network access, external databases are better reached through Power Query than directly from Python.
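A small sketch of such a reusable function follows. The function name margin_pct and the deal data are hypothetical. It also assumes, as we understand the feature, that a function defined in one Python cell can be called from Python cells calculated after it.

```python
import pandas as pd


def margin_pct(revenue, cost):
    """Return profit margin as a percentage of revenue (0.0 when revenue is zero)."""
    if revenue == 0:
        return 0.0
    return (revenue - cost) / revenue * 100


# Hypothetical deal data; in Excel this would come from a cell range.
deals = pd.DataFrame({
    "Revenue": [200.0, 150.0, 0.0],
    "Cost": [150.0, 75.0, 10.0],
})

# Apply the custom logic row by row.
deals["MarginPct"] = [
    margin_pct(r, c) for r, c in zip(deals["Revenue"], deals["Cost"])
]
```

Keeping the zero-revenue check inside the function means every caller gets the same handling of that edge case.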
Enterprises Reliant on Excel
Many large companies have vast Excel-based systems and processes that would be impractical to replace wholesale.
For these organizations, integrating Python into Excel gives users more advanced data tools while maintaining compatibility with existing infrastructure, data sets, and workflows.
This improves capabilities without the risks and costs of a ground-up rebuild.
What’s the Impact on Microsoft and the Future of Data Analytics?
Microsoft has strategic motivations for bringing Python to Excel:
Embrace Developers
By supporting Python, Microsoft appeals to the large community of data science practitioners who prefer Python over proprietary languages like VBA. This meets developers where they are rather than forcing a Microsoft-specific language on them.
Extend Capabilities
Access to Python libraries directly within Excel greatly expands what can be accomplished by end users. This “extending” of capabilities could motivate upgrades to Microsoft 365 subscriptions.
Expand Cloud Usage
Running Python code in the Azure cloud instead of locally encourages increased cloud usage, which is likely more profitable for Microsoft. It could also, in principle, give the company access to data and code that might help improve its own AI capabilities.
Increase Enterprise Reliance
Many large companies are essentially “locked in” to Excel due to decades of accumulating Excel-based data, models, and processes.
By making Excel more capable via Python integration, Microsoft gives these enterprises less incentive to explore alternatives, further entrenching its position in the business world.
The Python integration cements Excel’s future as a platform for data analysis and modeling within businesses even as user needs evolve.
It allows enterprises to build institutional knowledge in Python while keeping it directly connected to existing Excel-based workflows. Individual users also obtain access to leading-edge data science capabilities without the disruption of adopting entirely new tools.
Microsoft has deftly positioned itself at the center of the data analysis landscape – a position that brings both innovation opportunities and ethical questions related to transparency and data privacy.
Nonetheless, the arrival of Python in Excel is a win for end users who will become empowered to do more with data than ever before.