Pandas is one of the most powerful and widely used libraries for data manipulation and analysis in Python. It provides data structures like DataFrames and Series, allowing users to work with structured data efficiently.
In this guide, we will explore essential Pandas functions with examples to help you master data handling in Python.
To install Pandas, use the following command:
pip install pandas
Once installed, you can import Pandas in your script:
import pandas as pd
A Pandas Series is a one-dimensional labeled array; the labels make up its index.
import pandas as pd

data = [10, 20, 30, 40]
series = pd.Series(data, index=['A', 'B', 'C', 'D'])
print(series)
A DataFrame is a two-dimensional table with labeled axes (rows and columns).
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)
Pandas allows reading data from various sources like CSV, Excel, and SQL databases.
df = pd.read_csv('data.csv')
print(df.head())  # Displays the first 5 rows
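Because `data.csv` may not exist on your machine, here is a minimal sketch that reads the same kind of data from an in-memory string instead. The CSV content is hypothetical and mirrors the DataFrame built earlier; `pd.read_csv` accepts a file-like object as well as a path.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'data.csv'
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Charlie,35,Chicago
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first 5 rows by default (only 3 exist here)
print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # inferred type of each column
```

Checking `shape` and `dtypes` right after loading is a quick way to confirm the file was parsed as expected.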
print(df['Name'])            # Select a single column
print(df[['Name', 'Age']])   # Select multiple columns
print(df.iloc[0])            # Select first row
print(df.loc[0, 'Name'])     # Select specific value
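The difference between `iloc` (position-based) and `loc` (label-based) is easier to see when the index is not the default 0, 1, 2. The sketch below assumes a small DataFrame with hypothetical string labels `'a'`, `'b'`, `'c'`.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],  # hypothetical row labels for illustration
)

first_row = df.iloc[0]                        # by position: the first row
name_b = df.loc['b', 'Name']                  # by label: row 'b', column 'Name'
subset = df.loc[['a', 'c'], ['Name', 'Age']]  # several rows and columns by label

print(first_row)
print(name_b)
print(subset)
```

With the default integer index, `df.loc[0]` and `df.iloc[0]` happen to return the same row, which is why the two are easy to confuse.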
filtered_df = df[df['Age'] > 25]
print(filtered_df)
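Filters can also combine several conditions. The following sketch uses the same example columns and assumes you want rows matching two criteria; note that each condition needs its own parentheses, and that `&` and `|` are used rather than `and` and `or`.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Both conditions must hold (logical AND)
older_in_chicago = df[(df['Age'] > 25) & (df['City'] == 'Chicago')]

# Either condition may hold (logical OR)
young_or_ny = df[(df['Age'] < 30) | (df['City'] == 'New York')]

print(older_in_chicago)
print(young_or_ny)
```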
# Choose one strategy: after dropna() there is nothing left for fillna() to fill
df_dropped = df.dropna()   # Remove rows that contain missing values
df_filled = df.fillna(0)   # Replace missing values with 0
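To see what each approach to missing data does, it helps to run both on the same input. This is a minimal sketch with a hypothetical DataFrame containing one missing age; neither call modifies the original `df`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 35],  # Bob's age is missing
})

missing_per_column = df.isna().sum()  # count missing values in each column
dropped = df.dropna()                 # drops Bob's row entirely
filled = df.fillna({'Age': 0})        # keeps all rows, fills Age with 0

print(missing_per_column)
print(dropped)
print(filled)
```

Dropping rows loses data, while filling with 0 can distort statistics such as the mean, so the right choice depends on the analysis.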
df['Age'] = df['Age'].astype(float)
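sorted_df = df.sort_values(by='Age')  # Sort rows by Age, ascending
# Select 'Age' before averaging: mean() raises an error on text columns in recent Pandas versions
grouped_df = df.groupby('City')['Age'].mean()
print(grouped_df)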
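Sorting and grouping are easier to follow when a city appears more than once. The sketch below assumes a hypothetical four-row DataFrame with two people per city.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Age': [25, 30, 35, 45],
    'City': ['New York', 'Chicago', 'Chicago', 'New York'],
})

# Sort from oldest to youngest
sorted_df = df.sort_values(by='Age', ascending=False)

# Mean age per city; the result is a Series indexed by City
mean_age = df.groupby('City')['Age'].mean()

print(sorted_df)
print(mean_age)
```

Selecting the `'Age'` column before calling `mean()` keeps the aggregation limited to numeric data.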
Pandas is an essential tool for data analysis in Python. It simplifies data manipulation, cleaning, and transformation, making it a must-know library for anyone working with structured data.
Mastering Pandas will enable you to work efficiently with large datasets and perform complex data operations with ease.