Introduction to Pandas

Introduction to Pandas

In this Blog, I will be writing about all the basic stuff you need to know about Pandas.

  • What is Pandas?

    Pandas is an open-source library that is made mainly for working with relational or labeled data .This library is built on the top of the NumPy library. Pandas is fast and it has high-performance

  • Why do we use Pandas?

    Through pandas, you get aware with your data by cleaning, transforming, and analyzing it. For example , we want to explore a dataset stored in a csv format. Pandas will convert CSV into dataframe - a table basically and then let you perform many tasks easily.

  • Average,max,median or min of each column

  • Clean the data by doing things like removing missing values and filtering rows or columns by some criteria

  • Visualize the data using data visualisation libraries like matplotlib,seaborn
  • Getting started with Pandas

    Install Pandas

      pip install pandas
    

    Sample Data - Here I will use the famous Iris dataset .It comprises of the sepal length and petal length of the flowers.

Screenshot 2022-10-13 at 10.04.15 PM.png

Load Data into Pandas

  import pandas as pd
  df = pd.read_csv("csv_file_path")

The above code snippet reads data from a source and loads it into Pandas internal data structure called DataFrame.

Understanding Data

  # 1. access the first n rows of a dataframe 
  df.head()
  # 2. Some statistical information about your data
  df.describe()

Screenshot 2022-10-13 at 10.18.02 PM.png

Screenshot 2022-10-13 at 10.17.01 PM.png

I tried to provide all the important information on pandas for beginners. I hope you will find something useful here. Happy Learning !!