These functions are used to combine fields of two tables by matching the values of common attributes (text or numeric). They are among the most powerful and versatile table functions. You can use them:
- For adding new attributes or attribute combinations to tables (attribution, or attribute enrichment). For example, you can add a unique product_id to a table for each unique product_name-color pair (1 to 1 relation). Or you can add a new attribute category to a table, which is mapped to each product_id (1 to N relation). Both cases are demonstrated in the examples below.
- For adding new key figures into tables along with new attributes. For example, you can add a new key figure for average sales per category into a table with product ids and categories.
- Generally, you can add new attributes and key figures to tables in order to calculate the values of other key figures through row-by-row processing of the table (see table transformation functions). For example, you could add margin to a table with the key figure costs before calculating price, where price = costs x (1 + margin).
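The price formula in the last item can be sketched as a row-by-row calculation. This is a minimal Python illustration, not the library's own transformation function; the rows and the helper name add_price are hypothetical.

```python
# Hypothetical rows: costs per product, with a margin already added
# to each row by a table combination.
rows = [
    {"product_id": "P1", "costs": 100.0, "margin": 0.25},
    {"product_id": "P2", "costs": 80.0, "margin": 0.5},
]


def add_price(row):
    """Return a copy of the row with price = costs x (1 + margin)."""
    return {**row, "price": row["costs"] * (1 + row["margin"])}


priced = [add_price(r) for r in rows]
for r in priced:
    print(r["product_id"], r["price"])
```

The calculation only works once margin sits in the same row as costs, which is why the combination step has to come first.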
Adding key figures of Table2 to Table1
The following conditions must be satisfied in order to use this table function:
- Table1 and Table2 must have no common key figures.
- Table2’s attributes must be a subset of Table1’s attributes. That is, Table1 must contain all the attributes of Table2.
- Table2 must have distinct (unique) attribute rows; that is, each row of the table must have a different attribute value combination.
In the following example, the new key figures margin and discount from the margin table are added to the cost table by matching the values of the common attribute category in both tables.
// default key figure values for unspecified categories
var DefaultKeyFigValues = new Dictionary<string, double>();
DefaultKeyFigValues["margin"] = 0.22;
DefaultKeyFigValues["discount"] = 0.11;

// initiate index vectors for matched rows
NumVector MatchedRowsTbl1, MatchedRowsTbl2;

// combine key figures
MatrixTable CombinedKeyFigTbl1 = MatrixTable.CombineKeyFigures(CostTable, MarginTable1,
    DefaultKeyFigValues, out MatchedRowsTbl1, out MatchedRowsTbl2);

// view table
MatrixTable.View_MatrixTable(CombinedKeyFigTbl1, "Combined table; margins & discounts per category");
Note that default margins and discounts (0.22 and 0.11) are assigned to the products from the category Camera in the table above, because margins and discounts for this category were not specified in the margin table.
By row-partitioning the resultant combined table with the row indices returned by the function, you can get the sub-table with matched rows only (i.e. rows with categories that also exist in margin table):
// view subtable with matched rows only
MatrixTable.View_MatrixTable(
    MatrixTable.PartitionRow(CombinedKeyFigTbl1, MatchedRowsTbl1),
    "Combined table; margins & discounts per category - matched rows only");
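The matching logic described above (look up each row of the first table in the second, fall back to default values, and remember which rows matched) can be sketched in plain Python. This is an illustration of the idea only, not the library's implementation; the function name combine_key_figures, the sample rows, and the exact meaning of the returned index lists are assumptions.

```python
def combine_key_figures(table1, table2, key_attrs, defaults):
    """Add the key figures of table2 to every row of table1.

    table1, table2: lists of dicts, one dict per row.
    key_attrs: attribute names of table2 (all must exist in table1).
    defaults: key figure values for rows of table1 without a match.
    Returns (combined_rows, matched_rows1, matched_rows2).
    """
    # Index table2 by attribute combination; duplicates violate
    # the unique-rows condition.
    lookup = {}
    for j, row in enumerate(table2):
        key = tuple(row[a] for a in key_attrs)
        if key in lookup:
            raise ValueError(f"table2 has a duplicate attribute row: {key}")
        lookup[key] = j

    combined, matched1, matched2 = [], [], set()
    for i, row in enumerate(table1):
        j = lookup.get(tuple(row[a] for a in key_attrs))
        if j is None:
            extra = dict(defaults)  # unmatched row: use default values
        else:
            extra = {k: v for k, v in table2[j].items() if k not in key_attrs}
            matched1.append(i)
            matched2.add(j)
        combined.append({**row, **extra})
    return combined, matched1, sorted(matched2)


# Hypothetical sample data
cost_table = [
    {"category": "Computer", "product": "Toshiba", "costs": 1000.0},
    {"category": "Camera", "product": "Canon", "costs": 400.0},
    {"category": "Computer", "product": "Sony", "costs": 1200.0},
]
margin_table = [{"category": "Computer", "margin": 0.3, "discount": 0.1}]
defaults = {"margin": 0.22, "discount": 0.11}

combined, matched1, matched2 = combine_key_figures(
    cost_table, margin_table, ["category"], defaults)
matched_only = [combined[i] for i in matched1]  # sub-table with matched rows
```

Selecting rows by matched1 plays the role of the row partitioning step: it keeps only the rows whose category also exists in the margin table.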
In the following example, margins and discounts are specified in more detail, namely per product and year. In this case, these fields (product and year) are common attributes of the tables to be combined.
// default key figure values for unspecified categories
DefaultKeyFigValues["margin"] = 0.22;
DefaultKeyFigValues["discount"] = 0.11;

// combine key figures
MatrixTable CombinedKeyFigTbl2 = MatrixTable.CombineKeyFigures(CostTable, MarginTable2,
    DefaultKeyFigValues, out MatchedRowsTbl1, out MatchedRowsTbl2);

// view table
MatrixTable.View_MatrixTable(CombinedKeyFigTbl2, "Combined table; margins & discounts per product and year");
Note that default margin and discount values (0.22 and 0.11) are assigned to the row with the attribute combination Toshiba-2008, because this pair was not specified in the second margin table above.
Combining fields of two tables
CombineTables is a generalized version of the table function CombineKeyFigures explained above. Or, seen the other way around, CombineKeyFigures is a special case of CombineTables.
Note that neither table function is commutative; that is, f(Table1, Table2) is not the same as f(Table2, Table1). The output table contains all the fields and rows of the first operand, plus additional fields from the second operand.
With CombineTables you can combine not only the key figures, but all fields including the attributes (text and numeric) of tables, by matching the values of common attributes in each row. The following conditions must be satisfied in order to use this table function:
- Table1 and Table2 must have at least one common attribute; numeric or text
- Table1 and Table2 must have no common key figures
- Table2 must have unique attribute rows (unique table condition)
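The three conditions above can be checked before combining. The following Python sketch is only an illustration of those checks; the function name check_combine_conditions, its parameters, and the product id RDC1 are hypothetical and not part of the library.

```python
def check_combine_conditions(attrs1, keyfigs1, attrs2, keyfigs2, rows2):
    """Raise ValueError if a condition is violated.

    Returns the list of common attributes otherwise.
    """
    # 1. At least one common attribute
    common = [a for a in attrs2 if a in attrs1]
    if not common:
        raise ValueError("tables have no common attribute")
    # 2. No common key figures
    shared = set(keyfigs1) & set(keyfigs2)
    if shared:
        raise ValueError(f"common key figures: {sorted(shared)}")
    # 3. Unique attribute rows in the second table
    seen = set()
    for row in rows2:
        key = tuple(row[a] for a in attrs2)
        if key in seen:
            raise ValueError(f"duplicate attribute row in table 2: {key}")
        seen.add(key)
    return common


# Hypothetical margin table rows
margin_rows = [
    {"product_name": "Camera", "color": "blue", "product_id": "BLC3", "margin": 0.3},
    {"product_name": "Camera", "color": "red", "product_id": "RDC1", "margin": 0.25},
]
common_attrs = check_combine_conditions(
    attrs1=["product_name", "color"], keyfigs1=["costs"],
    attrs2=["product_name", "color", "product_id"], keyfigs2=["margin"],
    rows2=margin_rows,
)
print(common_attrs)
```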
The following example shows:
- How the independent attribute pair product_name-color in the cost table above can be mapped to a unique product_id
- How the new key figures margin and discount can be added to the cost table along with the text attribute product_id
New margin table below: Margins and discounts per product id; the attribute pair product_name-color is mapped to a unique product_id.
// combine tables: Add unshared fields of MarginTable3 to CostTable2
// no default values for unshared fields are specified explicitly for unmatched rows.
var CombinedTbl2 = MatrixTable.CombineTables(CostTable2, MarginTable3,
    null, null, null, out MatchedRowsTbl1, out MatchedRowsTbl2);

// view table
MatrixTable.View_MatrixTable(CombinedTbl2, "Combined table 2: Margins/Discounts/Costs per ProductID");
Let’s add product categories to the combined cost table above. For that purpose, we will use the table below as input which maps each product id to a product category.
// default category for product ids whose categories are not specified
var DefaultTextAttribValues = new Dictionary<string, string>();
DefaultTextAttribValues["category"] = "Undefined";

// combine tables
var CombinedTbl3 = MatrixTable.CombineTables(CombinedTbl2, CategoryMapTable,
    DefaultTextAttribValues, null, null, out MatchedRowsTbl1, out MatchedRowsTbl2);

// view table
MatrixTable.View_MatrixTable(CombinedTbl3, "Combined table 3: Category is added to table");
Notice that Undefined is assigned to product id BLC3 (blue camera) as category because a category for this product id was not specified in the category map table.
Now, assume that we have average sales per category in another table as shown below. The following example illustrates how this additional information can be added to the combined cost table above.
// combine tables
var CombinedTbl4 = MatrixTable.CombineTables(CombinedTbl3, AvgSalesTable,
    null, null, null, out MatchedRowsTbl1, out MatchedRowsTbl2);

// view table
MatrixTable.View_MatrixTable(CombinedTbl4, "Combined table 4: Average sales per category added to table");
Note that the default value 0 is assigned to the category Undefined as it was not specified in the sales table above.
Copyright secured by Digiprove © 2012 Tunc Ali Kütükcüoglu
There are a number of ways to do this, depending on what you really want. With no common columns, you need to decide whether you want to introduce a common column or get the product.
Let’s say you have the two tables:
parts:              custs:
+----+----------+   +-----+------+
| id | desc     |   | id  | name |
+----+----------+   +-----+------+
|  1 | Sprocket |   | 100 | Bob  |
|  2 | Flange   |   | 101 | Paul |
+----+----------+   +-----+------+
Forget the actual columns since you’d most likely have a customer/order/part relationship in this case; I’ve just used those columns to illustrate the ways to do it.
A Cartesian product will match every row in the first table with every row in the second:
> select * from parts, custs;

id desc     id  name
-- ----     --- ----
1  Sprocket 100 Bob
1  Sprocket 101 Paul
2  Flange   100 Bob
2  Flange   101 Paul
That’s probably not what you want since 1000 parts and 100 customers would result in 100,000 rows with lots of duplicated information.
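The row count of a Cartesian product can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database, which is an assumption made for runnability (the answer does not name a DBMS); desc is quoted because it is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER, "desc" TEXT);
    CREATE TABLE custs (id INTEGER, name TEXT);
    INSERT INTO parts VALUES (1, 'Sprocket'), (2, 'Flange');
    INSERT INTO custs VALUES (100, 'Bob'), (101, 'Paul');
""")

# Every row of parts is paired with every row of custs.
product_rows = conn.execute(
    "SELECT * FROM parts, custs ORDER BY parts.id, custs.id"
).fetchall()

for row in product_rows:
    print(row)
print(len(product_rows), "rows")  # number of parts x number of customers
```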
Alternatively, you can use a union to just output the data, though not side-by-side (you’ll need to make sure column types are compatible between the two selects, either by making the table columns compatible or coercing them in the select):
> select id as pid, desc, '' as cid, '' as name from parts
  union
  select '' as pid, '' as desc, id as cid, name from custs;

pid desc     cid name
--- ----     --- ----
             100 Bob
             101 Paul
1   Sprocket
2   Flange
In some databases, you can use a rowid/rownum column or pseudo-column to match records side-by-side, such as:
id desc     id  name
-- ----     --- ----
1  Sprocket 100 Bob
2  Flange   101 Paul
The code would be something like:
select a.id, a.desc, b.id, b.name from parts a, custs b where a.rownum = b.rownum;
It’s still like a Cartesian product, but the where clause limits how the rows are combined to form the results (so not a Cartesian product at all, really).
I haven’t tested that SQL since this is one of the limitations of my DBMS of choice, and rightly so: I don’t believe it’s ever needed in a properly thought-out schema. Since SQL doesn’t guarantee the order in which it produces data, the matching can change every time you run the query unless you have a specific relationship or order by clause.
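For experimenting, the row-matching idea can be tried in SQLite, which gives ordinary tables an implicit rowid column. This is a sketch under that assumption, run through Python's sqlite3 module; other databases expose row numbers differently, and the pairing here simply follows insertion order, so it carries no real meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER, "desc" TEXT);
    CREATE TABLE custs (id INTEGER, name TEXT);
    INSERT INTO parts VALUES (1, 'Sprocket'), (2, 'Flange');
    INSERT INTO custs VALUES (100, 'Bob'), (101, 'Paul');
""")

# Pair the n-th inserted part with the n-th inserted customer.
side_by_side = conn.execute("""
    SELECT a.id, a."desc", b.id, b.name
    FROM parts a, custs b
    WHERE a.rowid = b.rowid
    ORDER BY a.rowid
""").fetchall()

for row in side_by_side:
    print(row)
```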
I think the ideal thing to do would be to add a column to both tables specifying what the relationship is. If there’s no real relationship, then you probably have no business in trying to put them side-by-side with SQL.
If you just want them displayed side-by-side in a report or on a web page (two examples), the right tool to do that is whatever generates your report or web page, coupled with two independent SQL queries to get the two unrelated tables. For example, a two-column grid in BIRT (or Crystal or Jasper), each with a separate data table, or an HTML two-column table (or CSS), each with a separate data table.
Below is the test result:
Source = Folder.Files("C:……"),
#"Filtered Rows" = Table.SelectRows(Source, each [Name] = "File1.csv" or [Name] = "….csv" or [Name] = "….csv" or [Name] = "…"),
#"Combined Binaries" = Binary.Combine(#"Filtered Rows"[Content]),
#"Imported CSV" = Csv.Document(#"Combined Binaries", …),
#"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV"),
#"Filtered Rows1" = Table.SelectRows(#"Promoted Headers", each (… "P_PERIOD"))
Total 667287 rows. Takes around 15s to refresh the query.
Total 667287 rows. Also takes about 15s to refresh the query.
Many times in a relational database, the information you want to show in your query is in more than one table. This raises the question: “How do you combine results from more than one table?”
All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server.
SQL wouldn’t be a very useful language if it didn’t provide an easy means for you to combine results from more than one query. Fortunately there are three main ways you can combine data from multiple tables. We’ll go over these briefly here and provide links to more in-depth articles.
Three Main Ways to Combine Data
Data in relational database tables are organized into rows and columns. As we investigate ways to combine data, keep in mind that the end result will be to either add more columns to a result, perhaps from another related table, or add rows, by taking a set of rows from two or more tables.
When most people learn to combine data they learn about:
- JOIN – You can use joins to combine columns from one or more queries into one result.
- UNION – Use Unions and other set operators to combine rows from one or more queries into one result.
- Sub Queries – Sometimes called nested queries, these can be used to perform a separate search in the database whose results can be used in another query.
I like to think of joins as the glue that puts the database back together. Relational databases are usually normalized to make the data easier to maintain and to improve performance, but the end result is that information is separated into many tables. You can use joins to recombine that information into a more human-readable format. The data is recombined by matching columns from each table.
In all cases, joins require two main ingredients: two tables and a join condition. The tables are what we will use to pull the rows and columns, and the join condition is how we intend to match the columns between tables.
SELECT Person.FirstName,
       Person.LastName,
       PersonPhone.PhoneNumber
FROM   Person.Person
       INNER JOIN Person.PersonPhone
       ON Person.BusinessEntityID = PersonPhone.BusinessEntityID
There are two main types of joins: inner joins and outer joins.
Inner joins only return a resulting row if the join condition matches in both tables. Inner joins are mainly used to match the primary key of one table to a foreign key in another.
The second type of join is an outer join. Outer joins always return at least one row for each row of the main table, referred to as the Left or Right table, with null values in the columns of the other table when there is no match. Outer joins are useful for finding non-matching data.
It is important to note that joins can return more rows than exist in both tables combined. Joins return combinations of matches. If you inner join two tables, one containing 5 rows and the other 10, the result may contain anywhere from 0 to 50 rows, depending on the join condition.
A UNION is used to combine the rows of two or more queries into one result. Union is called a set operator.
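The difference between inner and outer joins, and the way row counts can grow, can be seen in a small sketch. It uses Python's sqlite3 module with hypothetical person and phone tables rather than AdventureWorks2012, so the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE phone (person_id INTEGER, number TEXT);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO phone VALUES (1, '555-0100'), (1, '555-0101'), (2, '555-0102');
""")

# Inner join: only people with at least one phone, one row per match.
inner_rows = conn.execute("""
    SELECT p.name, ph.number
    FROM person p INNER JOIN phone ph ON ph.person_id = p.id
    ORDER BY p.id, ph.number
""").fetchall()

# Left outer join: every person appears; missing phones become NULL (None).
left_rows = conn.execute("""
    SELECT p.name, ph.number
    FROM person p LEFT JOIN phone ph ON ph.person_id = p.id
    ORDER BY p.id, ph.number
""").fetchall()

# Outer joins help find non-matching data: people without a phone.
no_phone = conn.execute("""
    SELECT p.name
    FROM person p LEFT JOIN phone ph ON ph.person_id = p.id
    WHERE ph.person_id IS NULL
""").fetchall()

# A join condition that is always true pairs every row with every row.
all_pairs = conn.execute(
    "SELECT COUNT(*) FROM person p INNER JOIN phone ph ON 1 = 1"
).fetchone()[0]

print(inner_rows, left_rows, no_phone, all_pairs)
```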
There are some special conditions that must occur in order for a union to work. First each query must have the same number of columns. Second, the data types of these columns must be compatible. Generally speaking, each query must return the same number and type of columns.
A practical example of union is when two tables contain part numbers and you want to create a combined list for a catalogue. You can either elect to have the end result be a unique listing for the combined query or, if you use UNION ALL, return all rows from each table.
SELECT C.Name
FROM   Production.ProductCategory AS C
UNION
SELECT S.Name
FROM   Production.ProductSubcategory AS S
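The difference between UNION and UNION ALL shows up as soon as both tables share a value. The sketch below uses Python's sqlite3 module with small hypothetical category and subcategory tables; the names and rows are assumptions, not AdventureWorks data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (name TEXT);
    CREATE TABLE subcategory (name TEXT);
    INSERT INTO category VALUES ('Bikes'), ('Clothing');
    INSERT INTO subcategory VALUES ('Bikes'), ('Helmets'), ('Gloves');
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute("""
    SELECT name FROM category
    UNION
    SELECT name FROM subcategory
    ORDER BY name
""").fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute("""
    SELECT name FROM category
    UNION ALL
    SELECT name FROM subcategory
    ORDER BY name
""").fetchall()

print(union_rows)
print(union_all_rows)
```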
In addition to Union there are a couple of other handy set operators:
- INTERSECT – You can use this to only return rows that are common between two tables.
- EXCEPT – You can use this to return rows that exist in one table, but aren’t found in another.
As you go on to learn more SQL, you’ll find that you can use joins to write equivalent statements for Intersect and Except, but there are no equivalents for Union.
Sub queries are sometimes called nested queries. They are queries defined inside of other queries. Sub queries can be confusing. I think a lot of this stems from the fact that they can be used in many places in a SQL select statement, and for several purposes!
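The join equivalents for INTERSECT and EXCEPT can be sketched as follows, using Python's sqlite3 module and two hypothetical single-column tables. The equivalence shown assumes the compared column contains no NULL values, since set operators and join conditions treat NULL differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_a (part TEXT);
    CREATE TABLE stock_b (part TEXT);
    INSERT INTO stock_a VALUES ('P1'), ('P2'), ('P3');
    INSERT INTO stock_b VALUES ('P2'), ('P3'), ('P4');
""")


def rows(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()


# Rows common to both tables
intersect_rows = rows(
    "SELECT part FROM stock_a INTERSECT SELECT part FROM stock_b ORDER BY part")
intersect_join = rows("""
    SELECT DISTINCT a.part
    FROM stock_a a INNER JOIN stock_b b ON a.part = b.part
    ORDER BY a.part
""")

# Rows in stock_a that are not in stock_b
except_rows = rows(
    "SELECT part FROM stock_a EXCEPT SELECT part FROM stock_b ORDER BY part")
except_join = rows("""
    SELECT DISTINCT a.part
    FROM stock_a a LEFT JOIN stock_b b ON a.part = b.part
    WHERE b.part IS NULL
    ORDER BY a.part
""")

print(intersect_rows, except_rows)
```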
For example, here are some areas you may see a sub query:
- SELECT clause – Used to return a value. For instance, if you’re querying a sales table, you could include the total sales by returning a sum of all sales from within a sub query.
- WHERE clause – Sub queries can be used in the where clause in comparisons. You could set up a comparison to compare sales to the overall average. The overall average would be returned from a sub query. You can also use sub queries in membership operators such as IN. Rather than hard-coding the in clause you can use a sub query to make it more dynamic.
- HAVING clause – A single value from a sub query is included in the HAVING clause comparisons.
Example Sub query
SELECT SalesOrderID,
       LineTotal,
       (SELECT AVG(LineTotal)
        FROM   Sales.SalesOrderDetail) AS AverageLineTotal
FROM   Sales.SalesOrderDetail
When used in select clauses and with comparison operators such as equals, greater than, and less than, a sub query must return a single value (one row and one column). If used in conjunction with a membership operator, such as IN, it is OK for the sub query to return one or more rows.
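The three uses of sub queries described above (in the SELECT clause, in a WHERE comparison, and with IN) can be tried on a small scale. This sketch uses Python's sqlite3 module with hypothetical sales and vip tables; the names and figures are assumptions, not AdventureWorks data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (order_id INTEGER, customer TEXT, line_total REAL);
    CREATE TABLE vip (name TEXT);
    INSERT INTO sales VALUES (1, 'Bob', 10.0), (2, 'Paul', 20.0), (3, 'Bob', 60.0);
    INSERT INTO vip VALUES ('Bob');
""")

# SELECT clause: the sub query returns one value, repeated on every row.
with_avg = conn.execute("""
    SELECT order_id, line_total,
           (SELECT AVG(line_total) FROM sales) AS avg_total
    FROM sales
    ORDER BY order_id
""").fetchall()

# WHERE clause comparison: the sub query must return a single value.
above_avg = conn.execute("""
    SELECT order_id FROM sales
    WHERE line_total > (SELECT AVG(line_total) FROM sales)
""").fetchall()

# Membership operator IN: the sub query may return several rows.
vip_orders = conn.execute("""
    SELECT order_id FROM sales
    WHERE customer IN (SELECT name FROM vip)
    ORDER BY order_id
""").fetchall()

print(with_avg, above_avg, vip_orders)
```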