Docsity
Docsity

Prepara i tuoi esami
Prepara i tuoi esami

Studia grazie alle numerose risorse presenti su Docsity


Ottieni i punti per scaricare
Ottieni i punti per scaricare

Guadagna punti aiutando altri studenti oppure acquistali con un piano Premium


Guide e consigli
Guide e consigli


COMPUTATIONAL TOOLS (excel, PP, PBI), Appunti di Informatica gestionale

basic knowledge in Excel, Power pivot and Power BI

Tipologia: Appunti

2022/2023

In vendita dal 01/06/2023

Appunti-gs
Appunti-gs 🇮🇹

5

(1)

31 documenti

1 / 16

Toggle sidebar

Questa pagina non è visibile nell’anteprima

Non perderti parti importanti!

bg1
GOAL--> how to turn data into information
Excel is not a DATABASE, thus is not a reliable way to store data.
Database--> When we talk about a 'TABLE' we mean a collection of columns or a single column, all adjacent to
each other. Each column is a 'FIELD' , it must have a name (not an empty space) and this name must be
UNIQUE.
In excel when we talk about a table we actually mean a RAIN OF CELLS. As excel is not a database, two
columns can have the same name. However, this raises errors along the way so we must avoid it.
Each column should have a single datatype, could be number, letters.
Cust no
Product
Production plant
Scheduled
ship date
Actual ship date
Unique
identifier
What the
consumer has
purchased
Where the product
has been
manufactured
When the order
was received
When we
promised to
ship
May or may not be
equal to the scheduled
ship date
Invoice date
Oder
quantity
Sales AMT
Unit price
Production serie
Document that states
the transaction
How much
Total amount
payed
Price per unit
Product family/group that summarizes
different products under the same categorie
Each row represents a signle transaction.
To highlight an entire column click on the letter (this highlight even the cells not in the table). Once you do
this, the count figure on the bottom right tells us how many of these cells are non-empty
-
To highlight only the table, not all cells--> ctrl-shift-
. To exclude specific cells shit-
-
F2--> look at the formula contained in the cell
-
Computation--> =SUM(..) If the input is in square brackets is optional. If it is bold it is mandatory.
-
Ctrl-c/v-->you copy and paste the FORMULA
-
Ctrl-c and then select
-
To indenty the different product serie you can simply add the filter
-
weighted averagesum of all prices and quantities / sum of the quantitiesamount you spent / amount you
bought
PIVOT TABLE
It is a tool that summarizes and reorganizes large amounts of data. It makes it easy to create specific reports on
determined data. To create a pivot table all headers must be filled, if not it will create an error. From a pivot table
you can create graphs.
LIMITS in excel:
Fed only by ONE table
-
Max number of row is 1 MILLION (databses do not have this restrinction)
-
When you select the fields:
If the column is numeric it will automatically sum the values
-
If it is text it will put into row
-
The grand total in pivot table is never calculated as a sum of subtotals
You can also create a DOUBLE ENTRY PIVOT, by using both rows and columns. However, it is better to only
have ONE COLUMN and ONE CALCULATION.
Excel automatically deletes rows with empty cells
When you create a pivot table, a copy of the original table is created. Thus, even if you change the values in the
original the pivot table does not change. To realign the copy to the original you have to REFRESH. Refreshing
can be automatized.
COMPUTATIONAL TOOLS
mercoledì 10 maggio 2023
17:53
EXCEL Page 1
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff

Anteprima parziale del testo

Scarica COMPUTATIONAL TOOLS (excel, PP, PBI) e più Appunti in PDF di Informatica gestionale solo su Docsity!

GOAL--> how to turn data into information Excel is not a DATABASE, thus is not a reliable way to store data. Database--> When we talk about a 'TABLE' we mean a collection of columns or a single column, all adjacent to each other. Each column is a 'FIELD' , it must have a name (not an empty space) and this name must be UNIQUE. In excel when we talk about a table we actually mean a RAIN OF CELLS. As excel is not a database, two columns can have the same name. However, this raises errors along the way so we must avoid it. Each column should have a single datatype, could be number, letters. Cust no Product Production plant Order entry Scheduled ship date Actual ship date Unique identifier What the consumer has purchased Where the product has been manufactured When the order was received When we promised to ship May or may not be equal to the scheduled ship date Invoice date Oder quantity Sales AMT Unit price Production serie Document that states the transaction How much Total amount payed Price per unit Product family/group that summarizes different products under the same categorie Each row represents a signle transaction. To highlight an entire column click on the letter (this highlight even the cells not in the table). Once you do this, the count figure on the bottom right tells us how many of these cells are non-empty

  • To highlight only the table, not all cells--> ctrl-shift-. To exclude specific cells shit-
  • F2--> look at the formula contained in the cell
  • Computation--> =SUM(..) If the input is in square brackets is optional. If it is bold it is mandatory.
  • Ctrl-c/v--> you copy and paste the FORMULA
  • Ctrl-c and then select
  • To indenty the different product serie you can simply add the filter weighted average→ sum of all prices and quantities / sum of the quantities→ amount you spent / amount you bought PIVOT TABLE It is a tool that summarizes and reorganizes large amounts of data. It makes it easy to create specific reports on determined data. To create a pivot table all headers must be filled, if not it will create an error. From a pivot table you can create graphs. LIMITS in excel:
  • Fed only by ONE table
  • Max number of row is 1 MILLION (databses do not have this restrinction) When you select the fields:
  • If the column is numeric it will automatically sum the values
  • If it is text it will put into row The grand total in pivot table is never calculated as a sum of subtotals You can also create a DOUBLE ENTRY PIVOT, by using both rows and columns. However, it is better to only have ONE COLUMN and ONE CALCULATION. Excel automatically deletes rows with empty cells When you create a pivot table, a copy of the original table is created. Thus, even if you change the values in the original the pivot table does not change. To realign the copy to the original you have to REFRESH. Refreshing can be automatized.

COMPUTATIONAL TOOLS

mercoledì 10 maggio 2023 17: EXCEL Page 1

  • To focus on only one field it can be used as a FILTER.
  • If you double click on a value in the pivot it will create an entire sheet based on it AT THE GRAND TOTAL WE DO NOT HAVE FILTERS FROM ROWS AND COLUMNS (however, we still have the general filters) Cardinality of a column→ number of rows of a value. cardinality refers to the number of unique values in a relational table column relative to the total number of rows in the table. You can create a graph out of a pivot table. A calculated field is a formula that is created specifically for use in a PivotTable. You can create a calculated field based on standard aggregation functions, such as COUNT or SUM, or by defining your own DAX (data analysis expression )formula. Ex: calculate weighted average cost. In the SALES TABLE, there is only one column dedicated to costumers, such as the identity customer number. If we were to include all info about the customers in the sales table it would become enourmous and inefficient. That is why we create a dedicated table to customers--> CUSTOMER TABLE. In the sales table I only have the keys (cust no) that can be connected to other more specific tables. KEY= common column between different tables VERTICAL LOOKUP is a function that helps you look for a specified value by searching for it vertically across the sheet. =VLOOKUP(cust no.1; customer table; cust co column; drag until the desired column; number of column; FALSE) 8 --> number of column starting from the one we do the search FALSE--> exact match. - The default is TRUE (approximate match) As we created the column in THE MIDDLE of the table, by refreshing the pivot table, this additional column will simply be added. If the column was created on the far right/left, we have to change the data source and drag the additional column in the picture. However, before doing ALL this we must check if the column in the lookuptable is a PRIMARY KEY, so it is a column that does not have any duplicates nor blank cells. If there are multiple equal rows, excell will simply pick the first one with the same key. To know IF the column is primary, i can simply create a pivot table on the product table, use the product column as ROWS. Highlight the table to find the count of rows and check if they correspond with the count of rows in the original table. (this works because a pivot table automatically deletes duplicates and blank rows). However, this does not tell us why. If I use the product field both as ROWS and as VALUES, and then sort the result from largest to smallest. I will know of which products I have more than one. If I then double click on the number, this will create an additional sheet that explains me why. Ex: the same product is produced in different plans Now that I know WHY it is not primary, i can use product and plant together. EXCEL Page 2

We can also test multiple IF CONDITIONS together. If condition 1 is true then add NEG, if it is false then check if condition 2 holds… Tables can be: FACT TABLES--> list of events. It contains the things we want to describe and it is OUT OF these tables that we make our calculations. Usually facts tables have fewer columns as not all characteristics are included, just a key identifier that is linked to a DIMENSION.

Ex: sales tables- event: transaction DIMENSIONS--> list of entities. Ex: customer, product, plant. We look at the data from the fact table from a different prospective/dimension, by focusing on one entity (pivot table). Usually dimension have many columns, each describing a quality of the entity, and fewer rows, as there are no repetition of the same specific entity (on the contrary in the sales table one customer can do multiple orders.) CALENDAR TABLE--> list of dates. By themselves they do not define anything, we must give them a meaning--> date of invoice, shipping… qualities describes in columns could be year, weak, semester. Calendar tables are important because without them we cannot do TIME INTELLIGENCE.

Ex: why is the calendar table useful Now these columns can be used in the pivot table. (americans use FISCAL YEAR--> this is aslo incorporated in the calendar table = FYquarter) What fi I want the suddivision by month? In the sales table I crate the invoice year-month. =YEAR(invoice date)&”-“&MONTH(invoice date). A quicker approach is by using VERTICAL LOOKUP: Check if the column date in the calendar table is a primary key

Vertical look up using the invoice date and the yearmonth column in the calendar table.

EXCEL Page 4

As we said Excel has some limitations: (before we used it as both datasource and analyse)

  • Max 1 million rows
  • Unprecise formulas in calculated fields
  • Not reliable store data How to overcome them? We will use two excels:
  • One as DATASOURCE (normally one should use a database but it is too complicated for now)
  • One will connect to the first one in order to analyse the data First we need to activate POWER PIVOT in excel--> File/options/add-ins/manage/COM add-ins/go/flag powerpivot for excel STEPS IN BUSINESS INTELLIGENCE: ETL (extract-transform-load)--> done by POWER QUERY. It connects to the remote data, cleans them and imports them

DATA MODELLING--> done by POWER PIVOT. We connect them into a data model. (calc columns, measures, calc tables, relationship)

  1. VISUALIZATIONS--> done by POWER PIVOT TABLES
  2. SHARING--> not possible through excel, but thorugh POWER BUSINESS INTELLIGENCE
  3. ANALITICS--> out of PBI HOW TO IMPORT THE DATA FROM THE DATASOURCE In Excel--> Data / get data /select your file We do not want to simply load the data into excel, but in Power Pivot Load to--> If there are problems with data we need to open power query -->Data - query and connections - double click on the sheet where we have the problem Now we can return to power pivot and create a pivot table where we can add data from different tables at once. To do this we must CONNECT the tables--> create RELATIONSHIP, so that the filter we apply to one table applies also to the other. Click manage and diagram view Now I can select Power Pivot and then manage.

DAX

martedì 16 maggio 2023 09:

In the pivot table--> add as slicer. This way you can remove the salesarea from the fields in the pivot table Calculated column--> not a native column (already included in the table) RULES:

  • In values we will no longer drag any column. Only measures written by ourselves In all the other fields (rows, column, filters, slicers), we only put columns and the columns need to come from the dimensions. From the sales table we only generate measures.

Now we can create MEASURES To measure the average selling price, first i need to calculate the ratio between the sum of the two columns I can now change the formulas to see the effects. Ex: sales increase by 30% However, this will not change also the average column, because the measure takes the column from the sales table not from the pivot table, Instead of retipyng the cose, I can reuse the measure created before: total sales and total quantity --> METAMEASURE, measure that works on top of other meaasures. Now if we remove the *1,3, averything will readjust.

By creating simply measure, we can create more complex ones by reusing the first ones. For divisions is better to use: DASHBOARD--> if you share it anyone can use it and see the related data. We can also insert a Chart by going on pivot table analyser Product serie--> first two digit of the field product Average sale per transaction--> total sales / number of transactions The number of transactions corresponds to the number of rows in the sales table. To calculate it: Average sale per customer--> total sales / number of customers I cannot count the rows in the customer table (all customers). I want the count of customers that purchase something. Thus, i want the customers in the sales table. However, here we have customers multiple times. I need i DISTINCT COUNT (counts the different customers - not additive = the value at the grand total is not the sum of the subvalues). DAX Formula evaluated in two contexts:

  • Filter context--> refers to pivot table, measure Row context--> refers to calculated columns. Here we can concatenate columns, because it is automatically considered one rowat a time (iteration)

Scalar expression--> anything that turns into a single value (sum, division, average) Ex: How much i invoice in a single day. How much will i loose if i stay close three days? Average revenue per day--> total sales / number of open days How can we estimate the number of days in which we are open?

Ex: relationship between the number of transaction of each area and the total transaction in percentage. --> num of transaction each area / total num of transaction To calculate the total number of transactions, as each row has applied the filter salesarea, we need to use the table function ALL (the output is stricly the same amount of input rows or MORE). In this case it returns the total amount of rows in the table sales. WHAT IF I WANT ALL TRANSACTIONS OF A SUBSET WITHOUT THE SALESAREA FILTER? Now I can do the division and I will have the percentage. NOW WE SWITCH TO POWER BUSINESS INTELLIGENCE It has the same features of excel + SHARING OPTION. Once we dowload it, we have to modify some optins:

  • GLOBAL - data load - deselect TIME INTELLIGENCE
  • CURRENT FILE - data load - deselect TIME INTELLIGENCE Now I can get the data from the excel file. If something is wrong with the data: clicl transform data and you will enter in Power Query. Here Pivot tables are called MATRIX. We can create Measures I can have more visualizations at once: The two are interactive. Click on one, the other will change. I can create all the measures I need. (this visualization is called CARD) To go to another row--> shift-enter

MAPS

SLICER

The SUM function only accepts one column --> this is because it is not a real DAX function. It is a simpler version of SUMX (scalar function). (x--> explicits to go row by row) Two situations in which we need to explicit SUMX.

  • When i need to sum an expression between two columns (not the sum of a single one) Thus, I can also apply filter:
  • With SUMX you can count the number of rows in a table =SUMX (tables, 1) --> for each row you calculate 1, then sums all 1s together = number of rows COUNTROWS (table) = SUMX (table, 1) It will go row by row in the sales table. For each row it will calculate the product of the unit price column and the quantity column Same number each row Same number each row

AND--> checks several logical condition. First it checks if the age of person 1 is higher than 18. then it will check if the age of the second person meets the condition of being older than 18. It returns TRUE only if ALL its operators are meet the condition. If not it returns FALSE: OR--> checks several logical condition. It returns FALSE only when EVERYTHING is false. If even ONLY ONE condition is tru, it returns true. In excel the number of inputs is not limited In DAX these functions only work with two conditions, we would need to nest(like we did with IF) but the code with be too complex. Thus, we use OPERATORS. Ex of AND: Ex of AND OPERATOR --> && Two conditions:

  • F
  • 2001 If the column of the condition is in the same table as sale I DO NOT NEED RELATED As more conditions you add, the smaller the amounts

AND and OR

venerdì 26 maggio 2023 10:

In this case if I use the OR FUNCTION, it will consider the transactions of 2001 no matter the gender, and it will consider the transaction of females no matter the year. Ex of OR OPERATOR--> || The more conditions you add to an OR STATEMENT, the more data you see.

Mix of AND / OR

Revenues F or (2001 and germany)--> the customer can be Female or BOTH 2001 AND germany The parentesis are needed to group those conditions

HOW TO USE OR

Only when the columns belong to the same table. Ex with gender and english education Classic method with SUMX: OR with calculate: TIME INTELLIGENCE (we put calendarYear in rows) I want to establish the growth from one year to the other. --> (revenues 2002 - 2001) / revenues 2001 --> we need the concept of previous year without explicitly writing it Now I can compute the rate of growth COMULATIVE FUNCTION--> it will add all values up untill that point I CAN MIX THE TWO FUNCTIONS Maybe two compare the YTD of two years I will have a column and for each year it will report the revenues for the previous one Numerator Denominator