Working with Pandas: A Beginner’s Guide


Pandas is a Python library for manipulating and analyzing tabular data. This guide walks through the fundamental operations: creating a DataFrame, renaming columns, adding, updating, and deleting data, and sorting and filtering rows.

Installing Pandas

To get started, you need to have Pandas installed. You can easily install it using pip:

pip install pandas
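A quick way to confirm the installation worked is to import the library and read its version string. This is a minimal check; the variable name version is only illustrative.

```python
import pandas as pd

# pandas exposes the installed version as a string attribute
version = pd.__version__
print(version)
```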

Creating a DataFrame

Let’s begin by creating a simple DataFrame. Here’s how you can initialize a DataFrame with some sample data:

import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)
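After creating a DataFrame, it helps to check its shape, column labels, and index before changing anything. The sketch below rebuilds the same sample data so it runs on its own; the names shape, columns, and index_labels are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

# (number of rows, number of columns)
shape = df.shape
# Column labels, in the order the dictionary keys were given
columns = list(df.columns)
# With no index specified, pandas assigns the labels 0, 1, 2
index_labels = list(df.index)

print(shape)
print(columns)
print(index_labels)
```

The default integer index matters later: the row-adding and row-dropping examples refer to rows by these labels.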

Renaming Columns

To rename columns, pass the rename method a mapping from old names to new ones. For example, let’s rename the “Sex” column to “Gender” (inplace=True modifies df directly instead of returning a new DataFrame):

df.rename(columns={"Sex": "Gender"}, inplace=True)
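Without inplace=True, rename returns a new DataFrame and leaves the original untouched, which some people find easier to reason about. A minimal sketch with a shortened two-row table (the data and the name renamed are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Braund, Mr. Owen Harris", "Bonnell, Miss. Elizabeth"],
        "Sex": ["male", "female"],
    }
)

# rename returns a copy with the new column label; df itself is unchanged
renamed = df.rename(columns={"Sex": "Gender"})

print(list(df.columns))
print(list(renamed.columns))
```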

Adding Data

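One way to add a row is to assign to a new index label with loc. With the default 0-based index, len(df) is the next unused label, and the values must be listed in column order. If rows have been deleted earlier, len(df) may match an existing label, in which case the assignment overwrites that row instead of adding one: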

df.loc[len(df)] = ["Smith, Mr. John", 28, "male"]
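As an alternative, pd.concat with ignore_index=True appends rows without depending on len(df) being a free label. This is a sketch, assuming the “Gender” column name from the rename step; new_row is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Gender": ["male", "male", "female"],
    }
)

# Build the new row as a one-row DataFrame keyed by column name,
# so the order of the values does not matter
new_row = pd.DataFrame(
    [{"Name": "Smith, Mr. John", "Age": 28, "Gender": "male"}]
)

# ignore_index=True renumbers the result 0..n-1
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```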

Updating Data

To update specific data in the DataFrame, you can use the loc indexer. For instance, let’s update the age of “Allen, Mr. William Henry”:

df.loc[df['Name'] == "Allen, Mr. William Henry", 'Age'] = 36
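A conditional update silently does nothing when no row matches, so it can be worth counting the matches first. The sketch below stores the condition in a variable (mask and matched are illustrative names) and reuses it for the assignment.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
    }
)

# Boolean Series: True for the rows whose Name matches
mask = df["Name"] == "Allen, Mr. William Henry"

# Number of rows the update will touch
matched = int(mask.sum())

# loc[row condition, column label] assigns only to the matching cells
df.loc[mask, "Age"] = 36

print(matched)
print(df)
```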

Deleting Data

You can delete rows and columns in various ways:

  • Delete by Name: To remove a row based on a condition, use boolean indexing:

    df = df[df['Name'] != "Allen, Mr. William Henry"]
  • Delete by Index: To remove a row by its index label (not its position), use the drop method. Here, label 3 is the “Smith, Mr. John” row added earlier:

    df = df.drop(3)
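  • Delete Column: To remove a column, use the drop method with the columns argument (equivalent to passing the label with axis=1). The result is stored under a new name here because the sorting and filtering examples below still need the “Age” column:

    df_without_age = df.drop(columns=["Age"])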
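The three deletions above can be tried side by side on one table. This is a sketch, assuming the four-row DataFrame produced by the earlier steps; each result gets its own illustrative name so the original df stays intact.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
            "Smith, Mr. John",
        ],
        "Age": [22, 36, 58, 28],
        "Gender": ["male", "male", "female", "male"],
    }
)

# Boolean indexing keeps the rows where the condition is True
without_allen = df[df["Name"] != "Allen, Mr. William Henry"]

# drop removes the row whose index label is 3
without_label_3 = df.drop(3)

# drop with columns= removes a column instead of a row
without_age = df.drop(columns=["Age"])

print(list(without_allen.index))
print(list(without_label_3.index))
print(list(without_age.columns))
```

Note that the surviving rows keep their original labels, so without_allen has a gap at label 1. Calling reset_index(drop=True) renumbers the rows if a contiguous index is needed.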

Sorting Data

The sort_values method returns a new, sorted DataFrame and leaves the original unchanged. Rows are sorted in ascending order by default; pass ascending=False for descending order:

  • Sort Ascending:

    sort_ascending = df.sort_values(by="Age")
  • Sort Descending:

    sort_descending = df.sort_values(by="Age", ascending=False)
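Sorted rows keep their original index labels, and sort_values also accepts a list of columns. The sketch below shows both, assuming the four-row table from the earlier steps; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
            "Smith, Mr. John",
        ],
        "Age": [22, 36, 58, 28],
        "Gender": ["male", "male", "female", "male"],
    }
)

# Sort by Age, then renumber the rows 0..n-1 in their new order
sorted_by_age = df.sort_values(by="Age").reset_index(drop=True)

# Sort by Gender ascending, breaking ties by Age descending
by_gender_then_age = df.sort_values(
    by=["Gender", "Age"], ascending=[True, False]
)

print(sorted_by_age)
print(by_gender_then_age)
```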

Filtering Data

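Filtering extracts the rows that satisfy a condition. The comparison produces a boolean Series, and indexing df with it keeps only the rows where the value is True:

  • Filter Age Below 30:

    filter_below_30 = df[df["Age"] < 30]
  • Filter Specific Age (this returns an empty DataFrame if you ran the update step above, since no row has an Age of 35 afterwards):

    filter_age = df[df["Age"] == 35]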
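Conditions can also be combined. The sketch below uses & for a logical AND (each condition needs its own parentheses) and isin to match any of several values. It assumes the four-row table from the earlier steps; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
            "Smith, Mr. John",
        ],
        "Age": [22, 36, 58, 28],
        "Gender": ["male", "male", "female", "male"],
    }
)

# Rows where both conditions hold; use | instead of & for a logical OR
males_under_30 = df[(df["Age"] < 30) & (df["Gender"] == "male")]

# Rows whose Age equals any value in the list
selected_ages = df[df["Age"].isin([22, 58])]

print(males_under_30)
print(selected_ages)
```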

Conclusion

Pandas provides a robust set of tools for data manipulation and analysis. By mastering these basic operations, you can efficiently clean, transform, and analyze your data. Stay tuned for more advanced topics in data analysis with Pandas!
