jsonstat.py Documentation, Exams of Italian

Italian

Notebook: using jsonstat.py python library with jsonstat format version ... import pandas as ps # using panda to convert jsonstat dataset to ...

Typology: Exams

2022/2023

Uploaded on 02/28/2023


jsonstat.py Documentation

Release 0.1.14

26fe

Aug 06, 2017


jsonstat.py is a library for reading the JSON-stat data format, which is maintained and promoted by Xavier Badosa. JSON-stat is a JSON format for publishing datasets, and several institutions use it to publish statistical data.


CHAPTER 1 Notebooks

Notebook: using jsonstat.py python library with jsonstat format version 1.

This Jupyter notebook shows the Python library jsonstat.py in action. JSON-stat is a simple, lightweight JSON dissemination format; for more information about the format, see the official site. This example shows how to explore the sample data file oecd-canada from the json-stat.org site. This file complies with version 1 of JSON-stat.
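
To make the structure of such a file more concrete, here is a minimal sketch that parses a toy document laid out in the JSON-stat 1.x style, where the dimension ids and sizes live inside the "dimension" object. The dataset name "toy" and all numbers are made up for illustration; this is not the real oecd-canada file, and the helper describe is hypothetical, not part of jsonstat.py.

```python
import json

# Toy JSON-stat 1.x style document (made-up numbers, not the real oecd-canada file).
DOC = json.loads("""
{
  "toy": {
    "label": "toy unemployment data",
    "value": [5.9, 5.4, 4.3, 4.9],
    "dimension": {
      "id": ["area", "year"],
      "size": [2, 2],
      "area": {"category": {"index": {"AU": 0, "AT": 1}}},
      "year": {"category": {"index": {"2003": 0, "2004": 1}}}
    }
  }
}
""")


def describe(doc):
    """Return {dataset_name: (dimension ids, sizes, number of values)}."""
    summary = {}
    for name, dataset in doc.items():
        dim = dataset["dimension"]
        summary[name] = (dim["id"], dim["size"], len(dataset["value"]))
    return summary


print(describe(DOC))
```

The number of values equals the product of the dimension sizes, which is the same relationship behind the 432 values of the oecd dataset explored in this notebook.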

# all imports here
from __future__ import print_function
import os
import pandas as pd  # pandas is used to convert a jsonstat dataset to a pandas DataFrame
import jsonstat  # import the jsonstat.py package
import matplotlib.pyplot as plt  # for plotting
%matplotlib inline

Download or use the cached file oecd-canada.json. Caching the file on disk makes it possible to work offline and speeds up the exploration of the data.

url = 'http://json-stat.org/samples/oecd-canada.json'
file_name = "oecd-canada.json"

file_path = os.path.abspath(os.path.join("..", "tests", "fixtures", "www.json-stat.org", file_name))
if os.path.exists(file_path):
    print("using already downloaded file {}".format(file_path))
else:
    print("download file and storing on disk")
    jsonstat.download(url, file_name)
    file_path = file_name

using already downloaded file /Users/26fe_nas/gioprj.on_mac/prj.python/jsonstat.py/tests/fixtures/www.json-stat.org/oecd-canada.json
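
The download-or-reuse pattern above can be isolated into a small helper. The following is a minimal sketch using only the standard library; the function cached_path and the fake_fetch callable are hypothetical names introduced for illustration and are not part of jsonstat.py. A fake fetcher is used so the example runs without network access.

```python
import os
import tempfile


def cached_path(file_name, cache_dir, fetch):
    """Return the path of file_name inside cache_dir, calling fetch() only on a cache miss.

    fetch is any zero-argument callable returning the file content as text,
    for example a function wrapping an HTTP download.
    """
    path = os.path.join(cache_dir, file_name)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "w") as f:
            f.write(fetch())
    return path


# Usage with a fake fetcher that records how many times it is called.
calls = []


def fake_fetch():
    calls.append(1)
    return '{"demo": true}'


cache_dir = tempfile.mkdtemp()
first = cached_path("demo.json", cache_dir, fake_fetch)
second = cached_path("demo.json", cache_dir, fake_fetch)  # served from disk
print(first == second, len(calls))
```

Passing the fetcher as a callable keeps the caching logic independent of how the data is actually downloaded.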

Initialize a JsonStatCollection from the file and print the list of datasets contained in the collection.

collection = jsonstat.from_file(file_path)
collection

Select the dataset named oecd. The oecd dataset has three dimensions (concept, area, year) and contains 432 values.

oecd = collection.dataset('oecd')
oecd

Show some detailed info about the dimensions.

oecd.dimension('concept')

oecd.dimension('area')

oecd.dimension('year')

Accessing values in the dataset

Print the value in the oecd dataset for area = IT and year = 2012

oecd.data(area='IT', year='2012')

JsonStatValue(idx=201, value=10.55546863, status=None)

oecd.value(area='IT', year='2012')

oecd.value(concept='unemployment rate',area='Australia',year='2004') # 5.

oecd.value(concept='UNR',area='AU',year='2004')
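
The idx=201 reported above is consistent with values being stored as one flat list in row-major order over the dimensions (concept, area, year). The sketch below computes such an offset; flat_index is a hypothetical helper, not the jsonstat.py implementation, and the category positions used (IT at position 16 of area, 2012 at position 9 of a 2003-2014 year dimension) are assumptions chosen to match the output above.

```python
def flat_index(sizes, positions):
    """Row-major offset of a cell, given each dimension's size and the category position."""
    idx = 0
    for size, pos in zip(sizes, positions):
        if not 0 <= pos < size:
            raise IndexError("position {} out of range for size {}".format(pos, size))
        idx = idx * size + pos
    return idx


# concept has 1 category, area 36, year 12, so 1 * 36 * 12 = 432 values in total.
# Assumption: IT is category 16 of 'area' and 2012 is category 9 of 'year'.
idx = flat_index([1, 36, 12], [0, 16, 9])
print(idx)  # 201
```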

Transforming the dataset into a pandas DataFrame

df_oecd = oecd.to_data_frame('year', content='id')
df_oecd.head()

df_oecd['area'].describe() # area contains 36 values


[['indicator', 'OECD countries, EU15 and total', '2003-2014', 'Value'],
 ['unemployment rate', 'Australia', '2003', 5.943826289],
 ['unemployment rate', 'Australia', '2004', 5.39663128],
 ['unemployment rate', 'Australia', '2005', 5.044790587],
 ['unemployment rate', 'Australia', '2006', 4.789362794]]

It is possible to transform jsonstat data into a table with the dimensions iterated in a different order

order = [i.did() for i in oecd.dimensions()]
order = order[::-1]  # reverse list
table = oecd.to_table(order=order)
table[:5]

[['indicator', 'OECD countries, EU15 and total', '2003-2014', 'Value'],
 ['unemployment rate', 'Australia', '2003', 5.943826289],
 ['unemployment rate', 'Austria', '2003', 4.278559338],
 ['unemployment rate', 'Belgium', '2003', 8.158333333],
 ['unemployment rate', 'Canada', '2003', 7.594616751]]
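
Comparing the two outputs, the columns stay the same while the order in which the rows are generated changes: with the reversed order, the area varies fastest instead of the year. The following sketch reproduces that behaviour on a toy dataset with made-up values. The function to_rows is a hypothetical illustration and makes no claim about how to_table is implemented in jsonstat.py.

```python
from itertools import product


def to_rows(dim_ids, categories, values, order=None):
    """Build [header] + rows from row-major values, iterating dimensions in `order`."""
    order = list(order or dim_ids)
    sizes = [len(categories[d]) for d in dim_ids]
    rows = [list(dim_ids) + ["Value"]]
    for combo in product(*(categories[d] for d in order)):
        chosen = dict(zip(order, combo))
        # Recompute the row-major offset in the dataset's native dimension order.
        idx = 0
        for d, size in zip(dim_ids, sizes):
            idx = idx * size + categories[d].index(chosen[d])
        rows.append([chosen[d] for d in dim_ids] + [values[idx]])
    return rows


dim_ids = ["area", "year"]
categories = {"area": ["Australia", "Austria"], "year": ["2003", "2004"]}
values = [10, 11, 20, 21]  # made-up values in native (area, year) order

native = to_rows(dim_ids, categories, values)
by_year = to_rows(dim_ids, categories, values, order=["year", "area"])
print(native[1:3])   # Australia 2003, Australia 2004
print(by_year[1:3])  # Australia 2003, Austria 2003
```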

Notebook: using jsonstat.py python library with jsonstat format version 2.

This Jupyter notebook shows the Python library jsonstat.py in action. JSON-stat is a simple, lightweight JSON dissemination format; for more information about the format, see the official site.

This notebook uses the data file oecd-canada-col.json from the json-stat.org site. This file complies with version 2 of JSON-stat. The notebook is otherwise identical to the version 1 notebook; the only difference is the data source.
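
One visible difference between the two formats is that JSON-stat 2.0 responses declare a top-level "version" property, while version 1 documents are keyed directly by dataset name. A minimal sketch of telling them apart follows; jsonstat_version is a hypothetical helper for illustration, and jsonstat.py may detect the version differently.

```python
def jsonstat_version(doc):
    """Guess the JSON-stat version of a parsed document (a dict)."""
    # JSON-stat 2.0 responses carry a top-level "version" string such as "2.0".
    if "version" in doc:
        return doc["version"]
    # Older documents have no "version" property; treat them as 1.x.
    return "1.x"


v2_like = {"version": "2.0", "class": "collection"}
v1_like = {"oecd": {"value": [], "dimension": {"id": [], "size": []}}}
print(jsonstat_version(v2_like), jsonstat_version(v1_like))
```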

# all imports here
from __future__ import print_function
import os
import pandas as pd  # pandas is used to convert a jsonstat dataset to a pandas DataFrame
import jsonstat  # import the jsonstat.py package
import matplotlib.pyplot as plt  # for plotting
%matplotlib inline

Download or use the cached file oecd-canada-col.json. Caching the file on disk makes it possible to work offline and speeds up the exploration of the data.

url = 'http://json-stat.org/samples/oecd-canada-col.json'
file_name = "oecd-canada-col.json"

using already downloaded file /Users/26fe_nas/gioprj.on_mac/prj.python/jsonstat.py/tests/fixtures/www.json-stat.org/oecd-canada-col.json


Initialize a JsonStatCollection from the file and print the list of datasets contained in the collection.

collection = jsonstat.from_file(file_path)
collection

Select the first dataset. The oecd dataset has three dimensions (concept, area, year) and contains 432 values.

oecd = collection.dataset(0)
oecd

Show some detailed info about the dimensions.

oecd.dimension('concept')

oecd.dimension('area')

oecd.dimension('year')

Accessing values in the dataset

Print the value in the oecd dataset for area = IT and year = 2012

oecd.data(area='IT', year='2012')

JsonStatValue(idx=201, value=10.55546863, status=None)

oecd.value(area='IT', year='2012')

oecd.value(concept='unemployment rate',area='Australia',year='2004') # 5.

oecd.value(concept='UNR',area='AU',year='2004')

Transforming the dataset into a pandas DataFrame

df_oecd = oecd.to_data_frame('year', content='id')
df_oecd.head()

df_oecd['area'].describe() # area contains 36 values

count     432
unique     36
top        ES
freq       12
Name: area, dtype: object


order = [i.did() for i in oecd.dimensions()]
order = order[::-1]  # reverse list
table = oecd.to_table(order=order)
table[:5]

Notebook: using jsonstat.py with eurostat api

This Jupyter notebook shows the Python library jsonstat.py in action. It shows how to explore datasets downloaded from a data provider, using some datasets from Eurostat. Eurostat provides a REST API to download its datasets. You can find details about the API here. A query builder can be used to discover the REST API parameters. The following image shows the query builder:

# all imports here
from __future__ import print_function
import os
import pandas as pd
import jsonstat
import matplotlib.pyplot as plt
%matplotlib inline

1 - Exploring data with one dimension (time) with size > 1

The following cell downloads a dataset from Eurostat. If the file has already been downloaded, the copy present on disk is used. Caching the file avoids downloading the dataset every time the notebook runs; it speeds up development and provides consistent results. You can see the raw data here

url_1 = 'http://ec.europa.eu/eurostat/wdds/rest/data/v1.1/json/en/nama_gdp_c?precision=1&geo=IT&unit=EUR_HAB&indic_na=B1GM'
file_name_1 = "eurostat-name_gpd_c-geo_IT.json"

file_path_1 = os.path.abspath(os.path.join("..", "tests", "fixtures", "www.ec.europa.eu_eurostat", file_name_1))
if os.path.exists(file_path_1):
    print("using already donwloaded file {}".format(file_path_1))
else:
    print("download file")
    jsonstat.download(url_1, file_name_1)
    file_path_1 = file_name_1

using already donwloaded file /Users/26fe_nas/gioprj.on_mac/prj.python/jsonstat.py/tests/fixtures/www.ec.europa.eu_eurostat/eurostat-name_gpd_c-geo_IT.json
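
The query URL used above can also be assembled from its parts, which makes the individual REST parameters easier to change. This is a minimal sketch using the standard library; the helper eurostat_url is hypothetical, and the base address and parameter names are simply copied from the URL in this notebook rather than taken from the Eurostat documentation.

```python
from urllib.parse import urlencode

# Base endpoint as it appears in the URL used in this notebook.
BASE = "http://ec.europa.eu/eurostat/wdds/rest/data/v1.1/json/en/"


def eurostat_url(dataset, params):
    """Join the base endpoint, the dataset code and the query parameters into one URL."""
    # params is a list of (name, value) pairs so that a name such as 'geo' can be repeated.
    return BASE + dataset + "?" + urlencode(params)


url_one_country = eurostat_url("nama_gdp_c", [
    ("precision", 1), ("geo", "IT"), ("unit", "EUR_HAB"), ("indic_na", "B1GM"),
])
url_two_countries = eurostat_url("nama_gdp_c", [
    ("precision", 1), ("geo", "IT"), ("geo", "FR"), ("unit", "EUR_HAB"), ("indic_na", "B1GM"),
])
print(url_one_country)
```

Using a list of pairs instead of a dict is a deliberate choice here, because a dict cannot hold the same parameter name twice.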

Initialize JsonStatCollection with eurostat data and print some info about the collection.


collection_1 = jsonstat.from_file(file_path_1)
collection_1

The previous collection contains only one dataset, named 'nama_gdp_c'

nama_gdp_c_1 = collection_1.dataset('nama_gdp_c')
nama_gdp_c_1

All dimensions of the dataset 'nama_gdp_c' have size 1, with the exception of the time dimension. Let's explore the time dimension.

nama_gdp_c_1.dimension('time')

Get value for year 2012.

nama_gdp_c_1.value(time='2012')

Convert the jsonstat data into a pandas dataframe.

df_1 = nama_gdp_c_1.to_data_frame('time', content='id')
df_1.tail()

Adding a simple plot

df_1 = df_1.dropna()  # remove rows with NaN values
df_1.plot(grid=True, figsize=(20,5))

2 - Exploring data with two dimensions (geo, time) with size > 1

Download or use the jsonstat file cached on disk. The cache avoids downloading from the internet during development, which makes things a bit faster. You can see the raw data here

url_2 = 'http://ec.europa.eu/eurostat/wdds/rest/data/v1.1/json/en/nama_gdp_c?precision=1&geo=IT&geo=FR&unit=EUR_HAB&indic_na=B1GM'
file_name_2 = "eurostat-name_gpd_c-geo_IT_FR.json"

file_path_2 = os.path.abspath(os.path.join("..", "tests", "fixtures", "www.ec.europa.eu_eurostat", file_name_2))
if os.path.exists(file_path_2):


df_4 = nama_gdp_c_2.to_data_frame('time', content='id', blocked_dims={'geo':'IT'})
df_4 = df_4.dropna()
df_4.plot(grid=True, figsize=(20,5))
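
The blocked_dims argument above appears to fix a dimension to a single category, so that only the rows for geo = IT remain. The sketch below shows that idea as a plain filter over a toy table with made-up values. It is an assumption about the semantics based on the call above, and block_dims is a hypothetical helper, not the jsonstat.py code.

```python
def block_dims(rows, header, blocked):
    """Keep only the rows whose blocked dimensions equal the requested category."""
    cols = {name: header.index(name) for name in blocked}
    return [r for r in rows if all(r[cols[n]] == v for n, v in blocked.items())]


header = ["geo", "time", "Value"]
rows = [
    ["IT", "2011", 1.0],  # made-up values
    ["FR", "2011", 2.0],
    ["IT", "2012", 3.0],
    ["FR", "2012", 4.0],
]
italy = block_dims(rows, header, {"geo": "IT"})
print(italy)
```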

Notebook: using jsonstat.py to explore ISTAT data (house price index)

This Jupyter notebook shows how to use the jsonstat.py Python library to explore Istat data. Istat is the Italian National Institute of Statistics. It publishes a REST API for querying Italian statistics.

We start by importing some modules.

from __future__ import print_function
import os
import istat
from IPython.core.display import HTML

Step 1: using istat module to get a jsonstat collection

The following code sets a cache dir where the json files downloaded through the Istat api are stored. Storing files on disk speeds up development and ensures consistent results over time. You can always delete the files to download a fresh copy.

cache_dir = os.path.abspath(os.path.join("..", "tmp", "istat_cached"))
istat.cache_dir(cache_dir)
print("cache_dir is '{}'".format(istat.cache_dir()))


cache_dir is '/Users/26fe_nas/gioprj.on_mac/prj.python/jsonstat.py/tmp/istat_cached'

Using the istat api, we can show the Istat areas used to categorize the datasets

istat.areas()

The following code lists all datasets contained in the area Prices.

istat_area_prices = istat.area('Prices')
istat_area_prices.datasets()

List all dimensions of the dataset DCSP_IPAB (House price index)

istat_dataset_dcsp_ipab = istat_area_prices.dataset('DCSP_IPAB')
istat_dataset_dcsp_ipab

Finally, we extract data in jsonstat format from the istat dataset by specifying the dimensions we are interested in.

spec = {
    "Territory": 1,
    "Index type": 18,
    "Measure": 0,
    # "Purchases of dwelling": 0,
    # "Time and frequency": 0
}
# convert istat dataset into jsonstat collection and print some info
collection = istat_dataset_dcsp_ipab.getvalues(spec)
collection

The previous call is equivalent to calling the istat api with the string of numbers "1,18,0,0,0". The mapping between the numbers and the dimensions is shown below:

dimension             number  meaning
Territory             1       Italy
Type                  18      house price index (base 2010=100) - quarterly data
Measure               0       ALL
Purchase of dwelling  0       ALL
Time and frequency    0       ALL

json_stat_data = istat_dataset_dcsp_ipab.getvalues("1,18,0,0,0")
json_stat_data
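
The equivalence between the spec dictionary and the "1,18,0,0,0" string can be sketched as follows. The dimension order is an assumption read off the mapping above, the helper spec_to_string is hypothetical, and treating a missing dimension as 0 (ALL) is inferred from the example rather than from the istat module itself.

```python
# Hypothetical dimension order, taken from the mapping shown in the text.
DIMENSION_ORDER = ["Territory", "Index type", "Measure",
                   "Purchases of dwelling", "Time and frequency"]


def spec_to_string(spec, order=DIMENSION_ORDER):
    """Serialize a spec dict as comma-separated numbers; missing dimensions become 0 (ALL)."""
    unknown = set(spec) - set(order)
    if unknown:
        raise KeyError("unknown dimensions: {}".format(sorted(unknown)))
    return ",".join(str(spec.get(name, 0)) for name in order)


spec = {"Territory": 1, "Index type": 18, "Measure": 0}
print(spec_to_string(spec))  # 1,18,0,0,0
```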

Step 2: using jsonstat.py api

Now that we have a jsonstat collection, let's explore it with the jsonstat.py api.

Print some info about one dataset contained in the above jsonstat collection

jsonstat_dataset = collection.dataset('IDMISURA1*IDTYPPURCH*IDTIME')
jsonstat_dataset

Print info about the dimensions to get an idea about the data

jsonstat_dataset.dimension('IDMISURA1')

jsonstat_dataset.dimension('IDTYPPURCH')

Notebook: using jsonstat.py to explore ISTAT data (unemployment)

from __future__ import print_function
import os
import pandas as pd
from IPython.core.display import HTML
import matplotlib.pyplot as plt
%matplotlib inline

import istat

Using istat api

The next step is to set a cache dir where the json files downloaded from Istat are stored. Storing files on disk speeds up development and ensures consistent results over time. If needed, you can delete the downloaded files to get a fresh copy.

cache_dir = os.path.abspath(os.path.join("..", "tmp", "istat_cached"))  # you could choose /tmp
istat.cache_dir(cache_dir)
print("cache_dir is '{}'".format(istat.cache_dir()))

cache_dir is '/Users/26fe_nas/gioprj.on_mac/prj.python/jsonstat.py/tmp/istat_cached'

List all istat areas

istat.areas()

List all datasets contained in the area LAB (Labour)

istat_area_lab = istat.area('LAB')
istat_area_lab

List all dimensions of the dataset DCCV_TAXDISOCCU (Unemployment rate)

istat_dataset_taxdisoccu = istat_area_lab.dataset('DCCV_TAXDISOCCU')
istat_dataset_taxdisoccu

Extract data from dataset DCCV_TAXDISOCCU

spec = {
    "Territory": 0,                             # 0: all territories (1 would select Italy)
    "Data type": 6,                             # 6: 'unemployment rate'
    'Measure': 1,                               # 1: 'percentage values'
    'Gender': 3,                                # 3: total
    'Age class': 31,                            # 31: '15-74 years'
    'Highest level of education attained': 12,  # 12: 'total'
    'Citizenship': 3,                           # 3: 'total'
    'Duration of unemployment': 3,              # 3: 'total'
    'Time and frequency': 0                     # 0: all
}
# convert istat dataset into jsonstat collection and print some info
collection = istat_dataset_taxdisoccu.getvalues(spec)
collection

Print some info of the only dataset contained into the above jsonstat collection


jsonstat_dataset = collection.dataset(0)
jsonstat_dataset

df_all = jsonstat_dataset.to_table(rtype=pd.DataFrame)
df_all.head()

df_all.pivot(index='Territory', columns='Time and frequency', values='Value').head()
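
The pivot call reshapes the long table (one row per territory and period) into a wide one (one row per territory, one column per period). The following plain-Python sketch shows the same reshaping on made-up records. The function pivot here is a hypothetical stand-in written for illustration, not the pandas implementation, and the period labels are assumptions.

```python
def pivot(records, index, columns, values):
    """Reshape long records (a list of dicts) into {index_value: {column_value: value}}."""
    wide = {}
    for rec in records:
        row = wide.setdefault(rec[index], {})
        if rec[columns] in row:
            # Like pandas, refuse to reshape when an (index, column) pair is duplicated.
            raise ValueError("duplicate entry for {!r}, {!r}".format(rec[index], rec[columns]))
        row[rec[columns]] = rec[values]
    return wide


records = [  # made-up values and period labels
    {"Territory": "Italy", "Time and frequency": "2012", "Value": 1.0},
    {"Territory": "Italy", "Time and frequency": "2013", "Value": 2.0},
    {"Territory": "Nord", "Time and frequency": "2012", "Value": 3.0},
    {"Territory": "Nord", "Time and frequency": "2013", "Value": 4.0},
]
wide = pivot(records, "Territory", "Time and frequency", "Value")
print(wide["Italy"])
```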

spec = {
    "Territory": 1,                             # 1: Italy
    "Data type": 6,                             # 6: 'unemployment rate'
    'Measure': 1,
    'Gender': 3,
    'Age class': 0,                             # 0: all classes
    'Highest level of education attained': 12,  # 12: 'total'
    'Citizenship': 3,                           # 3: 'total'
    'Duration of unemployment': 3,              # 3: 'total'
    'Time and frequency': 0                     # 0: all
}
# convert istat dataset into jsonstat collection and print some info
collection_2 = istat_dataset_taxdisoccu.getvalues(spec)
collection_2

df = collection_2.dataset(0).to_table(rtype=pd.DataFrame, blocked_dims={'IDCLASETA28': '31'})
df.head(6)

df = df.dropna()
df = df[df['Time and frequency'].str.contains(r'^Q.*')]
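
The filter above keeps only the rows whose period label starts with the letter Q, which presumably selects the quarterly observations. The same regular expression can be tried on its own with the standard library; the labels below are hypothetical examples, not values taken from the Istat dataset.

```python
import re

# Same pattern as in the DataFrame filter: labels that start with 'Q'.
QUARTERLY = re.compile(r'^Q.*')

labels = ["2012", "Q1-2012", "Q2-2012", "2013"]  # hypothetical period labels
quarterly = [label for label in labels if QUARTERLY.search(label)]
print(quarterly)
```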

df = df.set_index('Time and frequency')

df.head(6)

df.plot(x='Time and frequency',y='Value', figsize=(18,4))

fig = plt.figure(figsize=(18,6))
ax = fig.add_subplot(111)
plt.grid(True)
df.plot(x='Time and frequency', y='Value', ax=ax, grid=True)
