Loading...
Showing posts with label adventureworks. Show all posts
Showing posts with label adventureworks. Show all posts

Tuesday, December 10, 2013

Importing Data into a Database Engine, SQL Server 2012

Hello Readers!

Welcome back. We need data to be able to query tables. So today we will walk through how to import data into a Database Engine in SQL Server Management Studio 2012. Click here for more info on SSMS 2012. 

Let us get started!


The Setup


First, open SQL Server Management Studio (SSMS). We are greeted with a dialogue box to connect to a server. I will use Windows Authentication to connect to my local server. 


Connect to a Server
After we have connected, we see the Object Explorer to the left of the screen. The Server will show a green arrow to indicate a successful connection. The Object Explorer allows us to view and manipulate the different utilities, tools, and capabilities in SSMS. The expanded explorer box is shown below. Note the green 'good to go' arrow by the server name.


Connection Established


Importing the Data


Now we need to import the data. It would be helpful if we had the target data already on the computer, and in this case, we will be importing the familiar AdventureWorks database which exists as an Access database file. We can also import it as an mdf (mirror disk file) from this link.

Start by right clicking the database to where the data will be imported. Now from Tasks, click Import Data... towards the bottom.


Import Data Button

In step, we are in the SQL Server Import and Export Wizard. As shown below. Select the appropriate data source and locate the file on the hard drive:

Select the Data Source and Location


After choosing the input specifications, we need to choose the output Destination. I will import the data into the POWER (short for PowerPivot) database.

Choosing the Destination

And then for the nitty-gritty of specifying which tables to copy. We have two options: manual selection or SQL query. Select either, I will select manual because I will be copying all the data.


Specify Which Selection Method

After which, we can check the boxes of the tables we require. Clicking the box at the top left will select all of the tables. Click Next.

Selecting Tables

We are then given a review of the selected tables before they are imported into the database. The table attributes and types are shown to verify the correct tables have been selected, below.

Data Type Review
Clicking Next and Finish, we now have the tables in the POWER database. The Object Explorer below reflects the newly imported tables.

Object Explorer with New Tables

And with the new query window open, we can now query tables that we require!

Blank Query Window

That concludes this post on how to import data into a database in SQL Server Management Studio 2012. Future posts will include SQL querying and use of the SSMS Analysis Services to analyze the data. Also, I will include a post on using R to connect to SQL Server to retrieve tables. Please look forward to the new posts!


Thanks for reading!


Wayne

Friday, August 16, 2013

P2.1: Loading Data from External Sources to PowerPivot

This post will describe how to load external data sources to PowerPivot, using examples from Chapter 2 of the Ferrari PowerPivot book.


Loading Data PowerPivot

In the previous posts, we have been adding tables to the data model from Excel. However this method is inefficient, as PowerPivot can load data directly into the data model, without going through Excel. By doing so, we only need to load the data once, and it loads as compressed data into memory. When loading, Excel does not compress data so the resulting size of the file is around 10 times larger at times when loaded through Excel rather than PowerPivot (sizes from previous posts). Also, Excel is limited to storing 1 million rows, and PowerPivot can handle hundreds of millions of rows in memory. So loading data into first into Excel then into PowerPivot can hit the limit of rows and curb efficiency.

However, the data in the data model cannot be edited in PowerPivot, and can be edited in Excel. This can help prevent unwanted modification to the data set in PowerPivot as it is read-only. If the data set is loaded into Excel, all Excel editing features are available. Despite being read-only in PowerPivot, it is possible to add new columns to the original tables by using DAX language, but the original tables stay the same.



Loading from the AdventureWorks Database

In the PowerPivot tab, click on the Manage button to open the PowerPivot window.


Fig. 1: Manage Data Model button
The PowerPivot window should appear, and to add data, click on the Get External Data button on the Home tab.


Fig. 2: Get External Data Button in PowerPivot
For this specific demo, we will use the database that was included in the companion material as a MS Access file. Choose the Access connection and the Table Import Wizard window will appear. Find the Access database file where it was saved by selecting Browse and click the Next button.


Fig. 3: Access Database Table Import Wizard
For accessing a database from Microsoft SQL Server, click the SQL Server button. Fill in the Server Name and use the proper Authentication. 


Fig. 4: Microsoft SQL Server Table Import Wizard
The next window appears allowing two options to import the data: select the desired tables from a list, or write a SQL query to specify the data. For simplicity, select the first option to select from a list.


Fig. 5: Choosing How to Import the Data Window
Now we can select the four tables required: DimCustomer, DimDate, DimProduct, and FactInternetSales from the Table Import Wizard.


Fig. 6: Selecting the Tables
After clicking Finish, the tables are loaded into the data model. Select Diagram View in the bottom right-hand corner of the PowerPivot window to see how the tables relate to another. Relationships were explained in P1.2. The FactInternetSales table contains many key fields, as diagrammed below.


Fig. 7: PowerPivot Diagram View

Under the Design tab, clicking Manage Relationships will allow us to modify relationships between tables. The current relationships between the four tables in our data model are shown below in Figure 8.


Fig. 8: Manage Relationships Window in the Data Model


Creating a PivotTable in PowerPivot

After we have loaded the data into PowerPivot, we can create a PivotTable from the data model. Select the PivotTable button under the Home tab, highlighted in Figure 9. 


Fig. 9: PivotTable Button in PowerPivot
A dialogue box will appear (Figure 10) asking where to insert the new PivotTable, and we will create it in a New Worksheet. Since we are adding the PivotTable from the data model in PowerPivot, we do not have to specify from which data model we would like to create.


Fig. 10: Determining the Location of the New PivotTable
Clicking OK will complete the PivotTable creation process.


Fig. 10: Blank PivotTable Created from PowerPivot
Now working within the blank PivotTable (Figure 11), we can select the fields from the tables and place them in a defined quadrant:

  • Color from the DimProduct table to Rows quadrant,
  • CalenderYear from the DimDate table to the Columns quadrant,
  • SalesAmount from the FactInternetSales table to the Values quadrant.

Fig. 11: SalesAmount by Color by CalendarYear PivotTable
This shows how easily we can aggregate and slice values from different tables in a report for the SalesAmount, thanks to relationships which link the tables together.

In the next post we will create calculated columns and calculated fields into the PivotTable we just realized.


Thanks for reading,
Wayne

Sunday, August 4, 2013

P1.5: Building a Dashboard Report Using PivotTables- Excel 2013

In this post you will learn how to create a dashboard displaying Total Sales, percentage Growth, and ratio of Internet and Reseller sales using PivotTables in Excel 2013. The data is from the Ferrari PowerPivot book.


The Dashboard

A dashboard is a report which displays final results in a compact form. To begin, open the Dashboard workbook from Chapter 1 of the PowerPivot book we have been using. You will observe that this particular dashboard displays Total Sales of the AdventureWorks company. The Total Sales is arranged by country by region, along with percentage Growth of sales, and the percentage of Internet and Reseller sales. With the summarized information in a compact report, the dashboard allows the reader to visualize patterns and to assess the data more comprehensively than viewing a single table.


Fig. 1: Dashboard of Total Sales, Growth of Sales, and Internet-Reseller Sales by Country
By looking at Figure 1, you can observe the complexity in the types of data within the column sections of the dashboard. Creating this condensed report with computations and formatting will require multiple PivotTables or other intermediate tables to import the data to modify on the dashboard for the following reasons:
  • The regions within the Country are geographically sliced. So if there is only one region, then that region is collapsed and removed, leaving only the Country. A PivotTable cannot handle slicing and removing for only specific rows.
  • Column headers slice the headings differently and include different sets of Years, as with two years you can determine the growth percentage between the two. Also, Internet and Resell are from where the products were sold, different than years. Again, mixed slicing cannot be achieved with a PivotTable.
  • The data elements inside the cells are calculated differently. They vary from summation for Total Sales, percentage of sales Growth, and ratio of Internet and Reseller sales. The elements in the dashboard are from different calculations, not available in a PivotTable.
  • The colors in the dashboard allows the viewer to assess quickly the patterns in the data. They are derived from conditional formatting, different formulas for different cells, and are unable to be created with a PivotTable.
So we create a dashboard with data inputted from different PivotTables to devise a dashboard displaying different data requiring different formulas.


Creating the PivotTables

We start by querying the Data Model in the workbook with PivotTables. Because PivotTables have a dynamic quality, the PivotTables can change when the source data is updated with new data. For example, adding new entries or fields can change the positional coordinates of the specific cell in a PivotTable referenced by the Dashboard. Therefore, instead of stopping at PivotTables, Excel tables will be used to construct the Dashboard. By referencing Excel tables instead of PivotTables, there will be no need to change the formulas for the Dashboard every time the data is modified. 

The Dashboard worksheet already has the Data Model created. To add a PivotTable from the Data Model present in the workbook, click the Insert tab and then the PivotTable button as shown in Post 1.3. To access the Data Model, select Choose Connection under Use an external data source and the Existing Connections window will appear. The Data Model in the workbook is under the Tables tab as shown in Figure 2.



Fig. 2: Creating a PivotTable from a Data Model
Now we have a blank PivotTable of the Data Model, from which we can select fields to display in a table. To make the Internet PivotTable, select the fields SalesTerritoryGroup, SalesTerritoyCountry, SalesTerritoryRegion, FiscalYear, and SalesAmount from the DimSalesTerritory, DimTime, and FactInternetSales tables. Place the fields into the quadrants as shown below in Figure 3.


Fig. 3: PivotTable Fields for Internet Sales

For the Resellers PivotTable, you also select the same fields except the SalesAmount field is from the FactResellerSales table. The fields are placed in the same areas below the field selection area. Now you have created the PivotTables for Internet and Reseller Sales, depicted in Figures 4 and 5, respectively. 



Fig. 4: Internet Sales PivotTable


Fig. 5: Reseller Sales PivotTable
Converting PivotTables to Excel tables

To convert the PivotTables to 'regular' Excel tables, use the Convert to Formulas button under the OLAP Tools drop down button located in the Analyze tab. Do this for both Internet Sales and Resellers Sales tables.



Fig. 6: Convert to Formulas Button

After converting both the PivotTables into Excel tables, you know have tables you can use to reference for the Dashboard. The two finished Excel tables are shown in the Figures 7 and 8 below. 



Fig. 7: Reseller Sales Table


Fig. 8: Internet Sales Table.

Creating the Dashboard

Now that we have the two Excel tables, we can create the Dashboard, using standard Excel formulas and certain formatting for cells. Start by copying the row labels to the new empty worksheet which will contain the Dashboard. Next label the column headings. You can add color formatting to distinguish the labels from the data as needed.


Fig. 9: Dashboard Labels
Notice in the Internet and Reseller Sales tables that they do have missing values in some cells. When converting the PivotTable into Excel tables, the empty cells in PivotTables return as empty string cells into Excel tables. Therefore we have to use IF() statements in the Dashboard to return numerical zeros when the cell is an empty string for the Total Sales when adding Internet and Reseller sales. 


Fig. 10: IF() formula for Total Sales in Europe in 2006
For the percentage in Growth columns, you can use the mathematical formula to find the difference between the two desired years and divide by the base year for the percentage.


Fig. 11: Formula for Total Sales Growth in 2007 from 2006
By dividing the sales of the Internet by Total Sales, you can calculate the percentage of sales from Internet sales (and Reseller sales when you divide Reseller sales by Total Sales) in each year.


Fig. 12: Formula for Percentage
The color formatting for the cells can be achieved through conditional formatting where you can set certain values in cells to have certain cell colors. You can format the cells by clicking on the Conditional Formatting button shown below in Figure 13.


Fig. 13: Conditional Formatting Button
Dashboards are indispensable for conveying important data and information in a condensed report. Patterns from conditional formatting in Dashboards can aid in visual analysis of the data. As you can see in Figure 13, some Regions in the United States and Australia show poor growth in red in 2007. Also, the distribution of Internet and Reseller sales in lean towards Resellers in North America, whereas in Europe they are distributed more evenly after the years. 

Now you have created a Dashboard containing information on Total Sales, percentage Growth in Sales, and percentage of Internet and Reseller sales by year and Country-Region. 

Thanks for reading,
Wayne

Wednesday, July 24, 2013

P1.4: Working Within PowerPivot- Excel 2013

Hello Reader. Here we explore the data model using PowerPivot in Excel 2013 with the Chapter 1 AdventureWorks workbook from the Ferrari PowerPivot book.


Working in PowerPivot

So far we have used Excel to create and manage PivotTables and data models. To utilize the advanced features of PowerPivot, we will interact with the data model through PowerPivot. Begin by opening the worksheet with the data model made in Post 2, and in the PowerPivot tab, click on Manage Data Model.


Fig. 1: Manage Data Model Button under PowerPivot Tab
A PowerPivot window appears, showing the Data View of the data model with the loaded tables in the model highlighted on the bottom left of Figure 2. You can browse through the elements in the tables of the data model in this view.


Fig. 2: PowerPivot Data View
To understand the data model from a graphical point of view, click the Diagram View button at the top of Figure 2. There is also a toggle button at the lower right hand corner to toggle between views. The Diagram View displays the tables and their relationships so you can clearly assess which tables are linked together. In Figure 3, relationship arrow starts at the source table, Sales, and ends at SalesManagers, the target table.


Fig. 2: Diagram View

Clicking on the arrow will highlight the the tables linked through the relationship and also the field within the tables which links them together.


Fig. 4: Highlighted Relationship of Tables

Later, you will learn how to modify the data model using the Diagram View in PowerPivot.

Thanks for reading,
Wayne

Friday, July 19, 2013

P1.1: VLOOKUP and PivotTables in PowerPivot- Excel 2013

This post describes how to add data from other sheets using VLOOKUP, and make a PivotTable in Excel 2013 from the PivotTable workbook in Chapter 1 of the PowerPivot book.

VLOOKUP 

By opening the PivotTable workbook in Excel you can observe the relevant data for the fictional products of the example company, AdventureWorks. Figure 1 shows a segment of the data.
Fig. 1: AdventureWorks Sales data
The SalesManager column (B) was added to the Sales sheet using the VLOOKUP function (Figure 2) from the data in the SalesManagers sheet (Figure 3). 
=VLOOKUP([@ProductCategory], SalesManagers, 2)
The above VLOOKUP function matches the elements of the ProductCategory field of the current (Sales) sheet to first column of the SalesManager sheet, and retrieves the element of the corresponding second column (SalesManager) of the SalesManagers sheet. So, the Sales Manager of a specific product category shown on the SalesManagers sheet will be added to the respective category of product on the Sales sheet.

Fig. 2: VLOOKUP adding Sales Manager column in Sales sheet
Fig. 3: SalesManagers sheet
Although VLOOKUP is useful, it will be difficult and cumbersome if you want to analyze additional variables by adding them one by one. Relationships and Excel data models will be explored in the next post which will allow complete table analysis (not just one column at a time), and multi-table analysis (many columns of many tables).


PivotTables


This information is stored as an Excel table. To make a PivotTable, which allows analysis of many fields, and later can be linked together into a data model, Excel requires PowerPivot to be enabled in Excel 2013, and downloaded and installed for Excel 2010. (Click here on how to enable PowerPivot in Excel versions 2013 or 2010.) When PowerPivot is enabled, you should see the PowerPivot displayed in Excel's ribbon tab at the top right (Figure 4).

Fig. 4: PowerPivot tab in Excel 2013
To add a PivotTable of the AdventureWorks data, you click the Insert tab and then select the PivotTable button on the ribbon. An option window pops up, so you can select what data to analyze, and where to put the new PivotTable, as shown in Figure 5.

Fig. 5: PivotTable options
After you click OK, the blank PivotTable appears (Figure 6), ready for fields to be added to the report.
Fig. 6: Blank PivotTable with Fields on right side
From the right hand side in the PivotTable Fields, you can select the desired items to analyze and drag them to the appropriate quadrant in the bottom right corner, below the PivotTable Fields list. 

To produce a report regarding the total sales from 2005-2008 of categorized products sold under all sales managers, you select the ProductCategory, SalesManager, Year, and SalesAmount fields and drag them to the areas of rows for the first two, columns and values for the last two, as shown below in Figure 7.


Fig. 7: PivotTable with Sales of Products by Sales Managers by Year

There you have your PivotTable.

Thanks for reading,
Wayne