It is always nice to process data with modern tools like Pandas or Jupyter. But imagine that a colleague or friend asks you for a data analysis, and he or she is not a technical person, does not use Python or Jupyter, and does not have an account in Tableau, Power BI, or any other fancy (but, alas, not free) service. In this case, processing the data in Google Sheets can be a nice workaround for several reasons:
- Google is used worldwide; at the time of writing this article, more than 1.8 billion users have a Google Account. Almost everyone has one nowadays, so document sharing is extremely easy.
- Google’s ecosystem is safe and secure. It supports two-factor authentication and modern security standards, and even private datasets can be shared with a limited group of people.
- Last but not least, the solution is free. As a bonus, Google Sheets works in the browser, does not require installing any software, and runs on any platform: Windows, Linux, macOS, or even a smartphone.
In this article, I will do a basic exploratory data analysis in Pandas, and then we will repeat the process in Google Sheets and see how it works.
To make things more fun, let’s use a real dataset. We will make a tool to calculate the energy generated by solar panels. To do this, I will use data from PVGIS (the European Commission’s Photovoltaic Geographical Information System), which can be accessed for free via this URL (CC BY 4.0 Licence):
Using this page, we can download solar irradiation data, allowing us to calculate energy generation. As can be seen in the screenshot, we can select hourly data for different years and different locations. After downloading the data, let’s use it in Pandas.
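To give a rough idea of how hourly irradiation turns into energy, here is a minimal sketch. It is not the exact code used later in the article. The column names (`time`, `G(i)`, `T2m`), the `YYYYMMDD:HHMM` timestamp format, the sample values, and the panel parameters are all assumptions made for illustration; a real downloaded file may also contain metadata lines before and after the table that need to be skipped.

```python
import io

import pandas as pd

# Hypothetical sample that imitates an hourly export (assumed layout and values).
SAMPLE_CSV = """time,G(i),T2m
20200101:1010,250.0,5.1
20200101:1110,400.0,6.3
20200101:1210,350.0,6.9
"""


def load_hourly(source) -> pd.DataFrame:
    """Read hourly records and index them by timestamp."""
    df = pd.read_csv(source)
    # Assumed timestamp format: YYYYMMDD:HHMM
    df["time"] = pd.to_datetime(df["time"], format="%Y%m%d:%H%M")
    return df.set_index("time")


def estimate_energy_kwh(df: pd.DataFrame,
                        panel_area_m2: float = 1.6,
                        efficiency: float = 0.2) -> float:
    """Estimate generated energy, treating each row as one hour.

    G(i) is assumed to be the irradiance on the panel plane in W/m2,
    so one hourly row contributes G(i) * area * efficiency watt-hours.
    """
    watt_hours = df["G(i)"] * panel_area_m2 * efficiency
    return watt_hours.sum() / 1000.0


df = load_hourly(io.StringIO(SAMPLE_CSV))
energy = estimate_energy_kwh(df)
print(f"Estimated energy: {energy:.2f} kWh")
```

The panel area and efficiency here are placeholder values; for a real installation they should come from the panel datasheet, and losses in the inverter and wiring would reduce the result further.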
EDA in Pandas
Let’s start with exploratory data analysis (EDA) in Pandas. It’s always easier to…