Data analysis and visualization Python code

This example assumes that you have already installed the necessary libraries such as NumPy, Pandas, and Matplotlib.

# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load data into a Pandas dataframe
df = pd.read_csv("data.csv")

# Print the first few rows of the dataframe
print(df.head())

# Get summary statistics of the dataframe
print(df.describe())

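# Calculate the pairwise correlation between numeric columns
# (numeric_only=True skips text columns such as category_column; requires pandas 1.5 or later)
print(df.corr(numeric_only=True))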

# Create a scatter plot of two columns
plt.scatter(df['column1'], df['column2'])
plt.xlabel('Column 1')
plt.ylabel('Column 2')
plt.show()

# Create a bar chart of a categorical column
counts = df['category_column'].value_counts()
plt.bar(counts.index, counts.values)
plt.xlabel('Category')
plt.ylabel('Count')
plt.show()

In this example, we load a CSV file into a Pandas dataframe and then perform basic data analysis and visualization tasks. The head() method returns the first rows of the dataframe (five by default), which we print, while describe() calculates summary statistics for the numeric columns: count, mean, standard deviation, minimum, quartiles, and maximum. The corr() method calculates the pairwise correlation between columns, using the Pearson coefficient by default.
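To try these three calls without a data.csv file on disk, here is a minimal sketch that builds a small dataframe in memory. The values are made up purely for illustration, and the column names simply mirror the placeholders used in the code above. It assumes pandas 1.5 or later, where corr() accepts numeric_only.

```python
import pandas as pd

# Hypothetical stand-in for data.csv; the values are invented for illustration.
df = pd.DataFrame({
    "column1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "column2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "category_column": ["a", "b", "a", "c", "a"],
})

# First three rows instead of the default five
first_rows = df.head(3)

# Summary statistics; by default only numeric columns are included
summary = df.describe()

# Pairwise Pearson correlation, skipping the text column
correlations = df.corr(numeric_only=True)

print(first_rows)
print(summary)
print(correlations)
```

Because column2 is exactly twice column1 in this toy data, the correlation between the two comes out as 1 (up to floating-point rounding). Real data will rarely be this tidy.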

For visualization, we create a scatter plot of two columns using scatter(), and a bar chart of a categorical column using bar(), after counting how often each category occurs with value_counts(). The xlabel() and ylabel() functions set the labels for the x and y axes, and show() displays the current plot. Because show() is called once per chart, the two charts appear one after the other.
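As a variation, the two charts can be drawn side by side in one figure and saved to a file instead of being shown in a window. This sketch uses Matplotlib's object-oriented interface (plt.subplots) rather than the pyplot calls above, so it is an alternative and not a drop-in copy. The sample data and the output filename plots.png are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for data.csv; the values are invented for illustration.
df = pd.DataFrame({
    "column1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "column2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "category_column": ["a", "b", "a", "c", "a"],
})

# One figure with two axes: scatter plot on the left, bar chart on the right
fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

ax_scatter.scatter(df["column1"], df["column2"])
ax_scatter.set_xlabel("Column 1")
ax_scatter.set_ylabel("Column 2")

# Count how often each category occurs, then draw one bar per category
counts = df["category_column"].value_counts()
ax_bar.bar(counts.index, counts.values)
ax_bar.set_xlabel("Category")
ax_bar.set_ylabel("Count")

fig.tight_layout()
fig.savefig("plots.png")  # hypothetical output filename
```

Saving to a file is handy when the script runs on a server or in a scheduled job where no window can be opened. For interactive use, drop the matplotlib.use("Agg") line and call plt.show() instead of savefig().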

Note that this is just a simple example; Python supports many more data analysis and visualization techniques.
