본문 바로가기
Project ( Portfolio )/Mini Project

Mini Project (USA Balance Sheet, Select Years 1980-2020)

by jhleeatl 2024. 5. 3.

 

Today, in order to study Python, I brought USA balance sheet data from Kaggle. The balance sheet is from the US federal, state, and local governments

 

I plan to analyze this data in various ways.

 

 

https://www.kaggle.com/datasets/sqlrockstar/usa-balance-sheet-select-years-1980-2020/data

 

USA Balance Sheet, Select Years 1980-2020

A snapshot of the balance sheet for US federal, state, and local governments.

www.kaggle.com

 

 

First of all, I import data and display all the rows as there are not many rows

import pandas as pd
file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None) #display all the rows
df = pd.read_csv(file_path)

Result

 

 

And here is the data information.

print(df.info())


#Result

Data columns (total 15 columns):
 #   Column                                            Non-Null Count  Dtype 
---  ------                                            --------------  ----- 
 0   Year                                              8 non-null      int64 
 1   Revenue                                           8 non-null      int64 
 2   Expenditures                                      8 non-null      int64 
 3   Surplus_(deficit)                                 8 non-null      object
 4   Cash_cash_equivalents_and_short_term_investments  8 non-null      int64 
 5   Total_assets                                      8 non-null      int64 
 6   Total_liabilities                                 8 non-null      int64 
 7   Net_worth                                         8 non-null      object
 8   Adj_Revenue                                       8 non-null      int64 
 9   Adj_Expenditures                                  8 non-null      int64 
 10  Adj_Surplus_(deficit)                             8 non-null      object
 11  Adj_Cash_cash_equivalents_short_term_investments  8 non-null      int64 
 12  Adj_Total_assets                                  8 non-null      int64 
 13  Adj_Total_liabilities                             8 non-null      int64 
 14  Adj_Net_worth                                     8 non-null      object
dtypes: int64(11), object(4)
memory usage: 1.1+ KB

 

 

Checking Null value, there are not Null values in the data.

print(df.isnull().sum())

#result

Year                                                0
Revenue                                             0
Expenditures                                        0
Surplus_(deficit)                                   0
Cash_cash_equivalents_and_short_term_investments    0
Total_assets                                        0
Total_liabilities                                   0
Net_worth                                           0
Adj_Revenue                                         0
Adj_Expenditures                                    0
Adj_Surplus_(deficit)                               0
Adj_Cash_cash_equivalents_short_term_investments    0
Adj_Total_assets                                    0
Adj_Total_liabilities                               0
Adj_Net_worth                                       0
dtype: int64

 

 

In the data information table, some of the columns are in object format. The reason for this is the accounting format. 

Negative amounts were written in () format, so we need to change these numbers to negative values in integer format.

 

Before

import pandas as pd
file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None)
df = pd.read_csv(file_path)

print(df[['Surplus_(deficit)', 'Net_worth', 'Adj_Surplus_(deficit)', 'Adj_Net_worth']])

#output

  Surplus_(deficit)  Net_worth Adj_Surplus_(deficit) Adj_Net_worth
0              (63)       717                  (203)        2,313 
1             (179)        42                  (359)           84 
2              410        866                   619         1,309 
3             (186)      (702)                 (248)         (928)
4           (1,199)    (3,600)               (1,423)       (4,264)
5             (491)    (6,906)                 (535)       (7,522)
6             (960)    (8,055)                 (974)       (8,170)
7           (3,085)   (10,444)               (3,085)      (10,444)

 

 

After

import pandas as pd
file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None)
df = pd.read_csv(file_path)

# get rid of () and convert to int format
columns_to_fix = ['Surplus_(deficit)', 'Net_worth', 'Adj_Surplus_(deficit)', 'Adj_Net_worth']
for column in columns_to_fix:
    df[column] = df[column].str.replace('(', '-').str.replace(')', '')  # delete () and change to '-'
    df[column] = df[column].str.replace(',', '')  # delete ','
    df[column] = df[column].astype(int)  # convert to int format

print(df.info())

#out
 #   Column                                            Non-Null Count  Dtype
---  ------                                            --------------  -----
 0   Year                                              8 non-null      int64
 1   Revenue                                           8 non-null      int64
 2   Expenditures                                      8 non-null      int64
 3   Surplus_(deficit)                                 8 non-null      int64
 4   Cash_cash_equivalents_and_short_term_investments  8 non-null      int64
 5   Total_assets                                      8 non-null      int64
 6   Total_liabilities                                 8 non-null      int64
 7   Net_worth                                         8 non-null      int64
 8   Adj_Revenue                                       8 non-null      int64
 9   Adj_Expenditures                                  8 non-null      int64
 10  Adj_Surplus_(deficit)                             8 non-null      int64
 11  Adj_Cash_cash_equivalents_short_term_investments  8 non-null      int64
 12  Adj_Total_assets                                  8 non-null      int64
 13  Adj_Total_liabilities                             8 non-null      int64
 14  Adj_Net_worth                                     8 non-null      int64

 

 

 

Or you can create new column with negative amount by using fomular.

 

basically revenue - expenditure = surplus or deficit and asset - liabilities = net worth

 

 

df["Surplus_clean"] = df["Revenue"] - df["Expenditures"]
df["Net_worth_clean"] = df["Total_assets"] - df["Total_liabilities"]
df["Adj_Surplus_clean"] = df["Adj_Revenue"] - df["Adj_Expenditures"]
df["Adj_Net_worth_clean"] = df["Adj_Total_assets"] - df["Adj_Total_liabilities"]

columns = df.columns.tolist()

print(data.info())

#output
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8 entries, 1980 to 2020
Data columns (total 18 columns):
 #   Column                                            Non-Null Count  Dtype 
---  ------                                            --------------  ----- 
 0   Revenue                                           8 non-null      int64 
 1   Expenditures                                      8 non-null      int64 
 2   Surplus_(deficit)                                 8 non-null      object
 3   Cash_cash_equivalents_and_short_term_investments  8 non-null      int64 
 4   Total_assets                                      8 non-null      int64 
 5   Total_liabilities                                 8 non-null      int64 
 6   Net_worth                                         8 non-null      object
 7   Adj_Revenue                                       8 non-null      int64 
 8   Adj_Expenditures                                  8 non-null      int64 
 9   Adj_Surplus_(deficit)                             8 non-null      object
 10  Adj_Cash_cash_equivalents_short_term_investments  8 non-null      int64 
 11  Adj_Total_assets                                  8 non-null      int64 
 12  Adj_Total_liabilities                             8 non-null      int64 
 13  Adj_Net_worth                                     8 non-null      object
 14  Surplus_clean                                     8 non-null      int64 
 15  Net_worth_clean                                   8 non-null      int64 
 16  Adj_Surplus_clean                                 8 non-null      int64 
 17  Adj_Net_worth_clean                               8 non-null      int64 
dtypes: int64(14), object(4)
memory usage: 1.2+ KB

 

 

From now on, I will create financial analysis materials that can assist in making management decisions using this data.

 

As there are no missing values in the given data, I will omit discussing the missing value treatment.

 

Additionally, I plan to analysie the data roughly categorized as follows:

 

Yearly comparison of revenue and expenses: Analyzing the trends of income and expenses to determine profitability by comparing Revenue (income) and Expenditures (expenses) on a yearly basis.

 

Trend of net income: Analyzing the changes in Surplus (deficit) (net income or loss) over the years to assess the financial soundness of the organization.

 

Comparison of assets and liabilities: Analyzing the organization's capital structure by comparing Total assets and Total liabilities on a yearly basis.

 

Trend of net assets: Analyzing the changes in Net worth (net assets) over the years to observe the trend of capital increase or decrease in the organization.

 

Comparison of adjusted figures: Analyzing the adjusted financial status by comparing the above data with adjusted values. Adjusted Surplus (deficit) (adjusted net income or loss) and Adj_Net_worth (adjusted net assets) enable more accurate financial analysis.

 

 

Yearly comparison of revenue and expenses:

 

df[['Year', 'Revenue', 'Expenditures']].plot(x='Year')
plt.xlabel('Year')
plt.ylabel('Amount')
plt.title('Revenue, Expenditures Over the Years')
plt.legend()
plt.show()

 

 

 

Trend of net income: 

df[['Year', 'Surplus_clean']].plot(x='Year', kind='bar')
plt.xlabel('Year')
plt.ylabel('Amount')
plt.title('Surplus Over the Years')
plt.axhline(0, color='black', linewidth=0.5)
plt.legend()
plt.show()

 

 

This is the output of Surplus over the years.

 

 

 

Combinaton of Revenue, Expenditures, and Surplus over the year

 

input

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None)
df = pd.read_csv(file_path)
df["Surplus_clean"] = df["Revenue"] - df["Expenditures"]
df["Net_worth_clean"] = df["Total_assets"] - df["Total_liabilities"]
df["Adj_Surplus_clean"] = df["Adj_Revenue"] - df["Adj_Expenditures"]
df["Adj_Net_worth_clean"] = df["Adj_Total_assets"] - df["Adj_Total_liabilities"]

# Set Graph
plt.figure(figsize=(10, 6))

# Add line graph
plt.plot(df['Year'], df['Revenue'], marker='o', label='Revenue')
plt.plot(df['Year'], df['Expenditures'], marker='o', label='Expenditures')

# Add Bar graph
plt.bar(df['Year'], df['Surplus_clean'], label='Surplus')

# set the title of graph
plt.title('Revenue, Expenditures, and Surplus Over the Years')
plt.xlabel('Year')
plt.ylabel('Amount ($)')
plt.xticks(np.arange(1980, 2025, step=5))  # x value (5 years)

# range
plt.legend()

# print
plt.grid(True)
plt.tight_layout()
plt.show()

 

output

 

 

The combined results of the two graphs above provide a comprehensive overview of the company's financial performance by year, including revenue, expenditures, and surplus. While revenue growth is evident, expenditures have outpaced this increase, resulting in a negative surplus. This indicates that the company's financial situation was not favorable prior to the adjustments implemented.

 

 

Comparison of adjusted figures

 

Here is the adjusted value 

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None)
df = pd.read_csv(file_path)
df["Surplus_clean"] = df["Revenue"] - df["Expenditures"]
df["Net_worth_clean"] = df["Total_assets"] - df["Total_liabilities"]
df["Adj_Surplus_clean"] = df["Adj_Revenue"] - df["Adj_Expenditures"]
df["Adj_Net_worth_clean"] = df["Adj_Total_assets"] - df["Adj_Total_liabilities"]

# Set Graph
plt.figure(figsize=(10, 6))

# Add line graph
plt.plot(df['Year'], df['Adj_Revenue'], marker='o', label='Adj_Revenue')
plt.plot(df['Year'], df['Adj_Expenditures'], marker='o', label='Adj_Expenditures')

# Add Bar graph
plt.bar(df['Year'], df['Adj_Surplus_clean'], label='Adj_Surplus')

# set the title of graph
plt.title('Adj_evenue, Adj_Expenditures, and Adj_Surplus Over the Years')
plt.xlabel('Year')
plt.ylabel('Amount ($)')
plt.xticks(np.arange(1980, 2025, step=5))  # x value (5 years)

# range
plt.legend()

# print
plt.grid(True)
plt.tight_layout()
plt.show()

 

 

The adjusted graph mirrors the overall trend observed in the first graph, further reinforcing the conclusion that the company's financial situation is not favorable. Notably, the data indicates a significant deficit in 2020.

 

 

 

Trend of net assets

 

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

file_path = "/Users/junhyunlee/Desktop/data/usa_balance_sheet.csv"
pd.set_option('display.max_rows', None)
df = pd.read_csv(file_path)
df["Surplus_clean"] = df["Revenue"] - df["Expenditures"]
df["Net_worth_clean"] = df["Total_assets"] - df["Total_liabilities"]
df["Adj_Surplus_clean"] = df["Adj_Revenue"] - df["Adj_Expenditures"]
df["Adj_Net_worth_clean"] = df["Adj_Total_assets"] - df["Adj_Total_liabilities"]

# Set Graph
plt.figure(figsize=(10, 6))

# Line graph
plt.plot(df['Year'], df['Total_assets'], marker='o', label='Total Assets')
plt.plot(df['Year'], df['Total_liabilities'], marker='o', label='Total Liabilities')
plt.plot(df['Year'], df['Net_worth_clean'], marker='o', label='Net Worth (Clean)')

# Title of graph and label
plt.title('Total Assets, Total Liabilities, and Net Worth Over the Years')
plt.xlabel('Year')
plt.ylabel('Amount ($)')
plt.xticks(df['Year'], rotation=45)  # x축 눈금 설정 (연도)

# range
plt.legend()

# print
plt.grid(True)
plt.tight_layout()
plt.show()

 

 

 

 

Comparison of adjusted figures

 

 

Conclusion

 

The US federal, state, and local governments


Financial Analysis of the Company Based on the data. This analysis is based on the provided balance sheet data for the company from 1980 to 2020.

 

Here's a breakdown of the key observations and potential areas for further investigation:

 


Revenue and Expenditures:

The company's revenue has grown steadily over the years, with a significant increase between 2000 and 2005.
However, expenditures have generally outpaced revenue growth, particularly between 2010 and 2020. This trend has resulted in a negative surplus (deficit) for most of the period.  The deficit became especially concerning in 2020, reaching its highest point at -$3,085.

 

 

Profitability:

The company's profitability, as measured by surplus, has been negative for most of the period. This suggests the company has been struggling to cover its expenses with its revenue. There were brief periods of positive surplus in the early 2000s, but these were not sustained.

 

 

Liquidity:

Cash and cash equivalents along with short-term investments (Adj_Cash_cash_equivalents_short_term_investments) have increased steadily over time, indicating potentially improving liquidity.
However, a more in-depth analysis of current assets and current liabilities would be necessary to assess the company's short-term solvency.

 

Solvency:

Total liabilities have consistently grown faster than net worth (Adj_Net_worth), indicating a higher reliance on debt financing. This could be a cause for concern if the company struggles to generate sufficient cash flow to service its debts. Net worth has become negative (Adj_Net_worth_clean) since 2010, highlighting the company's insolvency based on this metric.

 

 

Financial Adjustments:

The balance sheet includes adjusted figures (Adj_Revenue, Adj_Expenditures, etc.). It would be helpful to understand the underlying reasons for these adjustments to fully interpret the financial health of the company.

 


Overall:

The company's financial situation appears challenging. The consistent negative surplus and increasing debt raise concerns about its long-term viability.