freecodecamp-data-analysis/demographic-data-analyzer


ngschaider c887194ed6 Unit tests raising Error. (#2)

* fixed assertAlmostEqual calls in tests

As the docs show, the signature of assertAlmostEqual is `assertAlmostEqual(first, second, places=7, msg=None, delta=None)`.
Because the message was originally passed as a positional argument and only three arguments were given, the message effectively took the place of the `places` argument.

The default of places=7 is fine; we just have to pass the message as a keyword argument instead, which resolves the resulting `TypeError`s when testing.

* change assertAlmostEqual to assertEqual for string

assertAlmostEqual only makes sense when asserting ints/floats

* no need for keyword argument

Signature of method is `assertEqual(first, second, msg=None)`
Using a keyword argument is not required here.

Co-authored-by: ngschaider <gschaiderniklas@gmail.com>

2021-02-03 19:07:02 -06:00
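The argument mix-up described in the commit message above can be reproduced in isolation. The following is a minimal sketch, not the repository's actual test code: the class name, the numeric values, and the message strings are made up for illustration.

```python
import unittest


class AssertSignatureDemo(unittest.TestCase):
    def test_positional_message_is_taken_as_places(self):
        # The third positional slot is `places`, so the string ends up there
        # and rounding the difference fails with a TypeError.
        with self.assertRaises(TypeError):
            self.assertAlmostEqual(39.4, 39.5, "Expected a different mean age")

    def test_keyword_message_keeps_default_places(self):
        # With msg= passed by keyword, places keeps its default of 7.
        self.assertAlmostEqual(39.4, 39.40000001, msg="Expected a different mean age")

    def test_strings_use_assert_equal(self):
        # assertEqual(first, second, msg=None): a positional message is fine here.
        self.assertEqual("Prof-specialty", "Prof-specialty", "Expected a different occupation")


# Run the three demo tests and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertSignatureDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Note that the `TypeError` only appears when the two values differ, because `assertAlmostEqual` returns early when they compare equal.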

File	Last commit	Date
.replit	feat: add replit config (#1)	2020-09-29 14:49:41 -07:00
adult.data.csv	init	2020-09-29 12:23:25 -05:00
demographic_data_analyzer.py	init	2020-09-29 12:23:25 -05:00
main.py	init	2020-09-29 12:23:25 -05:00
poetry.lock	init	2020-09-29 12:23:25 -05:00
pyproject.toml	init	2020-09-29 12:23:25 -05:00
README.md	init	2020-09-29 12:23:25 -05:00
test_module.py	Unit tests raising Error. (#2)	2021-02-03 19:07:02 -06:00

README.md

Assignment

Demographic Data Analyzer

In this challenge you must analyze demographic data using Pandas. You are given a dataset of demographic data that was extracted from the 1994 Census database. Here is a sample of what the data looks like:

	age	workclass	fnlwgt	education	education-num	marital-status	occupation	relationship	race	sex	capital-gain	hours-per-week	native-country	salary
0	39	State-gov	77516	Bachelors	13	Never-married	Adm-clerical	Not-in-family	White	Male	2174	40	United-States	<=50K
1	50	Self-emp-not-inc	83311	Bachelors	13	Married-civ-spouse	Exec-managerial	Husband	White	Male	0	13	United-States	<=50K
2	38	Private	215646	HS-grad	9	Divorced	Handlers-cleaners	Not-in-family	White	Male	0	40	United-States	<=50K
3	53	Private	234721	11th	7	Married-civ-spouse	Handlers-cleaners	Husband	Black	Male	0	40	United-States	<=50K
4	28	Private	338409	Bachelors	13	Married-civ-spouse	Prof-specialty	Wife	Black	Female	0	40	Cuba	<=50K

You must use Pandas to answer the following questions:

How many people of each race are represented in this dataset? This should be a Pandas series with race names as the index labels. (race column)
What is the average age of men?
What is the percentage of people who have a Bachelor's degree?
What percentage of people with advanced education (Bachelors, Masters, or Doctorate) make more than 50K?
What percentage of people without advanced education make more than 50K?
What is the minimum number of hours a person works per week?
What percentage of the people who work the minimum number of hours per week have a salary of more than 50K?
What country has the highest percentage of people that earn >50K and what is that percentage?
Identify the most popular occupation for those who earn >50K in India.
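A few of the questions above can be sketched with standard Pandas calls. This is only an illustration on a tiny hand-built frame, not the starter code or a reference solution: the variable names are hypothetical, and the salary values are invented so that the percentages are non-trivial.

```python
import pandas as pd

# Toy frame loosely modeled on the sample rows; salary values are invented.
df = pd.DataFrame({
    "age": [39, 50, 38, 53, 28],
    "race": ["White", "White", "White", "Black", "Black"],
    "sex": ["Male", "Male", "Male", "Male", "Female"],
    "education": ["Bachelors", "Bachelors", "HS-grad", "11th", "Bachelors"],
    "salary": [">50K", "<=50K", "<=50K", ">50K", "<=50K"],
})

# Number of people per race, as a Series indexed by race name.
race_count = df["race"].value_counts()

# Average age of men, rounded to the nearest tenth.
average_age_men = round(float(df.loc[df["sex"] == "Male", "age"].mean()), 1)

# The mean of a boolean mask is the fraction of True values.
percentage_bachelors = round(float((df["education"] == "Bachelors").mean() * 100), 1)

# Split rows by advanced education, then measure the >50K share in each group.
higher = df["education"].isin(["Bachelors", "Masters", "Doctorate"])
higher_rich = round(float((df.loc[higher, "salary"] == ">50K").mean() * 100), 1)
lower_rich = round(float((df.loc[~higher, "salary"] == ">50K").mean() * 100), 1)

print(race_count)
print(average_age_men, percentage_bachelors, higher_rich, lower_rich)
```

Taking the mean of a boolean mask avoids counting rows and dividing by hand, which keeps each percentage to a single expression.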

Use the starter code in the file demographic_data_analyzer.py. Update the code so that every variable set to "None" holds the appropriate calculation or code. Round all decimals to the nearest tenth.
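As an illustration of the rounding requirement, here is one possible way to compute a per-country percentage and round it to the nearest tenth. This is a sketch under assumptions: the rows are invented, and the variable names are hypothetical rather than taken from the starter code.

```python
import pandas as pd

# Invented rows; only the two columns needed for this question are included.
df = pd.DataFrame({
    "native-country": [
        "United-States", "United-States", "United-States",
        "Cuba", "Cuba",
        "India", "India", "India",
    ],
    "salary": [">50K", "<=50K", "<=50K", "<=50K", "<=50K", ">50K", ">50K", "<=50K"],
})

# Boolean mask of high earners, averaged within each country to get a share.
is_rich = df["salary"] == ">50K"
share_by_country = is_rich.groupby(df["native-country"]).mean() * 100

# Country with the largest share, and that share rounded to one decimal place.
top_country = share_by_country.idxmax()
top_share = round(float(share_by_country.max()), 1)

print(top_country, top_share)
```

Rounding is applied once, at the end, so that intermediate shares keep full precision while the reported value has a single decimal.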

Unit tests are written for you under test_module.py.

Development

For development, you can use main.py to test your functions. Click the "run" button and main.py will run.

Testing

We imported the tests from test_module.py to main.py for your convenience. The tests will run automatically whenever you hit the "run" button.

Submitting

Copy your project's URL and submit it to freeCodeCamp.

Dataset Source

Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.