Random Findings as a Developer

September 14, 2013

Python – Quick Webserver

Filed under: Tools — Andrew @ 6:43 pm

I haven’t really looked into Python very much; however, today I found something super useful. If you need a simple web server set up to do some testing, you can just use the one that is built into Python!

In Terminal, change into the directory that you want to serve as the web server’s root and then type:

python -m SimpleHTTPServer

After it starts running, you will see the following:

Serving HTTP on port 8000 ...

After this, you can connect to your web server through your browser on the specified port (in my example, browsing to http://localhost:8000 shows the served directory).
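In Python 3, the module was renamed, so the equivalent one-liner is `python3 -m http.server 8000`. The same server can also be started programmatically from the standard library; a minimal sketch (serving the current directory on an OS-assigned port):

```python
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Serve the current directory, just like `python -m SimpleHTTPServer`.
# Port 0 asks the OS for any free port; use 8000 for a fixed one.
server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/" % server.server_address[1]
print("Serving HTTP on", url)
```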

September 5, 2013

EXCEPT and INTERSECT

Filed under: SQL Server — Andrew @ 11:20 pm

Every once in a while, I like to go back to some of the material that I learned while studying for the SQL Server 2008 exam and play with some of the neat features it mentioned that are not used very frequently.

Today was one of those days. I found a case where I was able to use “EXCEPT” in a query. In short, we needed to return rows with NULL data even for keys that had no data coming back (instead of just omitting those rows). So, to do that, we took all of the keys that we had from our live table and then inserted any that were missing.

What do they do?

EXCEPT: returns any distinct values from the left query that are not also found in the right query.

INTERSECT: returns any distinct values that are returned by both the query on the left and the query on the right side of the INTERSECT operator.

Please Note: There are two requirements for using these: the data types of the columns being compared must be compatible, and the number of columns must match between the two queries.

Example (see below for table outputs):

-- Table variables used in this example
DECLARE @Foo TABLE ( Bar VARCHAR(20), FooBar VARCHAR(20) )
DECLARE @Bar TABLE ( Foo VARCHAR(20), FooBar VARCHAR(20) )
-- Insert some dummy data in
INSERT INTO @Foo ( Bar, FooBar )
	VALUES ( 'asdf', '12345' ), ( 'asdf123', 'asdf3' ), ( '123s', 'iuhi' ), ( '123sf', 'asdf' )
-- Copy 2 random rows from the top table (order doesn't matter)
INSERT INTO @Bar ( Foo, FooBar )
	SELECT TOP 2 Bar, FooBar
	FROM @Foo
-- Add in a distinct value for the @Bar table
INSERT INTO @Bar ( Foo, FooBar )
	VALUES ( '12355', 'asdfasd' )
-- Emit the two tables
SELECT Bar, FooBar FROM @Foo
SELECT Foo, FooBar FROM @Bar
-- Emit everything that is in the @Foo table but not in the @Bar table
SELECT Bar, FooBar FROM @Foo
EXCEPT
SELECT Foo, FooBar FROM @Bar
-- Emit everything that is in both the @Foo and @Bar table
SELECT Bar, FooBar FROM @Foo
INTERSECT
SELECT Foo, FooBar FROM @Bar

SQL Server EXCEPT and INTERSECT example

Yes, you can do the exact same operation with a JOIN/LEFT JOIN, but this is a neat little trick for when you don’t want to bother thinking through the joins (e.g. when testing a query).
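EXCEPT and INTERSECT are standard SQL rather than SQL Server-only, so the example above can be sketched self-contained against SQLite from Python (same dummy data; SQLite’s LIMIT stands in for TOP):

```python
import sqlite3

# SQLite supports EXCEPT and INTERSECT too; this mirrors the
# @Foo/@Bar example above with plain tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Foo ( Bar TEXT, FooBar TEXT )")
conn.execute("CREATE TABLE Bar ( Foo TEXT, FooBar TEXT )")
conn.executemany("INSERT INTO Foo VALUES (?, ?)",
                 [("asdf", "12345"), ("asdf123", "asdf3"),
                  ("123s", "iuhi"), ("123sf", "asdf")])
# Copy two rows from Foo, then add one row unique to Bar
conn.execute("INSERT INTO Bar SELECT Bar, FooBar FROM Foo LIMIT 2")
conn.execute("INSERT INTO Bar VALUES ('12355', 'asdfasd')")

# Rows in Foo but not in Bar
only_in_foo = conn.execute(
    "SELECT Bar, FooBar FROM Foo EXCEPT SELECT Foo, FooBar FROM Bar").fetchall()
# Rows in both tables
in_both = conn.execute(
    "SELECT Bar, FooBar FROM Foo INTERSECT SELECT Foo, FooBar FROM Bar").fetchall()
print(only_in_foo)  # the two Foo rows that were not copied
print(in_both)      # the two copied rows
```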

Reference: http://technet.microsoft.com/en-us/library/ms188055(v=sql.90).aspx

September 4, 2013

Sites that allow you to format code/XML cleanly with the press of a button

Filed under: Development,Tools — Andrew @ 9:39 pm

Recently I had to debug some dynamic SQL that was built from an XML string provided to a stored procedure. Because both were generated dynamically, the code ended up all on one line, which made it virtually impossible to read (the string was 1000+ characters long).

To make this easier, I found some tools online that reformat code cleanly.

Hopefully these tools will help in the future.

SQL Formatter: http://sqlformat.appspot.com
XML Formatter: http://www.freeformatter.com/xml-formatter.html
JavaScript Formatter: http://jsbeautifier.org
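For XML specifically, if you have Python handy, the standard library can do a basic pretty-print without any website; a quick sketch:

```python
import xml.dom.minidom

# Pretty-print a one-line XML string with the standard library
ugly = '<root><item id="1"><name>foo</name></item></root>'
pretty = xml.dom.minidom.parseString(ugly).toprettyxml(indent="  ")
print(pretty)
```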

September 3, 2013

When was that index last used?

Filed under: SQL Server — Andrew @ 4:29 pm

Recently I started wondering whether some of the indexes that we build on our live tables are actually being hit by our queries, or whether we are just maintaining them for nothing.

Below is a query that will help you to figure this information out.

-- NULL if you want all tables, or a full table name if you want it filtered
DECLARE @TableName SYSNAME = NULL
SELECT TableName = SCHEMA_NAME(t.schema_id) + '.' + t.name
		, IndexName = i.name
		, FirstUserLookup = MIN(iu.last_user_lookup)
		, LastUserLookup = MAX(iu.last_user_lookup)
		, AverageNumberOfUserLookups = AVG(iu.user_lookups)
		, FirstUserScan = MIN(iu.last_user_scan)
		, LastUserScan = MAX(iu.last_user_scan)
		, AverageNumberOfUserScans = AVG(iu.user_scans)
		, FirstUserSeek = MIN(iu.last_user_seek)
		, LastUserSeek = MAX(iu.last_user_seek)
		, AverageNumberOfUserSeeks = AVG(iu.user_seeks)
		, FirstUserUpdate = MIN(iu.last_user_update)
		, LastUserUpdate = MAX(iu.last_user_update)
		, AverageNumberOfUserUpdates = AVG(iu.user_updates)
		, FirstSystemLookup = MIN(iu.last_system_lookup)
		, LastSystemLookup = MAX(iu.last_system_lookup)
		, AverageNumberOfSystemLookup = AVG(iu.system_lookups)
		, FirstSystemScan = MIN(iu.last_system_scan)
		, LastSystemScan = MAX(iu.last_system_scan)
		, AverageNumberOfSystemScans = AVG(iu.system_scans)
		, FirstSystemSeek = MIN(iu.last_system_seek)
		, LastSystemSeek = MAX(iu.last_system_seek)
		, AverageNumberOfSystemSeeks = AVG(iu.system_seeks)
		, FirstSystemUpdate = MIN(iu.last_system_update)
		, LastSystemUpdate = MAX(iu.last_system_update)
		, AverageNumberOfSystemUpdates = AVG(iu.system_updates)
FROM sys.indexes AS i
JOIN sys.tables AS t ON i.object_id = t.object_id
JOIN sys.dm_db_index_usage_stats AS iu ON i.object_id = iu.object_id
	AND i.index_id = iu.index_id
	AND iu.database_id = DB_ID()
WHERE @TableName IS NULL OR OBJECT_ID(@TableName) = t.object_id
GROUP BY t.name, i.name, t.schema_id
ORDER BY t.name, SCHEMA_NAME(t.schema_id), i.name

September 2, 2013

Columnstore Index Quirks

Filed under: SQL Server — Andrew @ 9:57 pm

New in SQL Server 2012 are columnstore indexes.  In short, a columnstore index stores data column by column in a highly compressed form, which lets the engine quickly eliminate the data segments that it doesn’t actually need to read.

However, there are a few quirks that you need to be aware of when using it.

Some that we have found are as follows:

  • Multiple COUNT(DISTINCT) expressions within the same query will force the engine into row processing mode and become slow
  • Mixing COUNT(DISTINCT) with other aggregations (e.g. SUM()) is slower than taking 2 trips to your fact table
  • Having multiple aggregations in a query (other than COUNT(DISTINCT)) only affects performance because it pulls more data into memory – not a big hit

Below are a few queries that we currently have and their respective execution times.

Please Note: All of these should return the same data set.

Below are some of our findings with their corresponding queries.  Hopefully you will find these as interesting as I did ☺

I am running the data into temp tables that are all created identically, and everything was done in a single batch so that all of the estimated costs per query are relative to each other.

Test Case 1: SUM and COUNT(DISTINCT) in the same query

CREATE TABLE #TotalsSingle (
	DepartmentKey SMALLINT
	, Dollars MONEY
	, ProductCount INT
)
-- How we normally code the counts
INSERT INTO #TotalsSingle ( DepartmentKey, Dollars, ProductCount )
SELECT DepartmentKey, SUM(Dollars), COUNT(DISTINCT ProductKey)
FROM FactSalesTimeframeDenormalised AS f
WHERE TimeFrameKey = 1
GROUP BY DepartmentKey

Execution Information:

Execution Information - SUM and COUNT(DISTINCT)

 Execution Plan:

Execution Plan - SUM and COUNT(DISTINCT)

Test Case 2: SUM and COUNT(DISTINCT) within separate CTEs

CREATE TABLE #TotalsCte (
	DepartmentKey SMALLINT
	, Dollars MONEY
	, ProductCount INT
)
-- Calculate the total dollars and the distinct products that were sold
-- separately (CTEs)
; WITH ProductCount AS (
	SELECT DepartmentKey, PD = COUNT(DISTINCT ProductKey)
	FROM FactSalesTimeframeDenormalised AS f
	WHERE TimeFrameKey = 1 AND IsPromo = 1
	GROUP BY DepartmentKey
), TotalDollars AS (
	SELECT DepartmentKey, Dollars = SUM(Dollars)
	FROM FactSalesTimeframeDenormalised AS f
	WHERE TimeFrameKey = 1
	GROUP BY DepartmentKey
)
INSERT INTO #TotalsCte ( DepartmentKey, Dollars, ProductCount )
SELECT f.DepartmentKey, SUM(Dollars), SUM(pd.PD)
FROM TotalDollars AS f
LEFT JOIN ProductCount AS pd ON f.DepartmentKey = pd.DepartmentKey
GROUP BY f.DepartmentKey

Execution Information:

Execution Information - CTE SUM and COUNT(DISTINCT)

Execution Plan:

Execution Plan - CTE SUM and COUNT(DISTINCT)

Test Case 3: SUM and then UPDATE

CREATE TABLE #TotalsUpdate (
	DepartmentKey SMALLINT
	, Dollars MONEY
	, ProductCount INT
)
-- Get the total dollars
INSERT INTO #TotalsUpdate ( DepartmentKey, Dollars, ProductCount )
SELECT DepartmentKey, SUM(Dollars), 0
FROM FactSalesTimeframeDenormalised AS f
WHERE TimeFrameKey = 1
GROUP BY DepartmentKey
-- Then calculate the product count
; WITH ProductCount AS (
	SELECT DepartmentKey, PD = COUNT(DISTINCT ProductKey)
	FROM FactSalesTimeframeDenormalised AS f
	WHERE TimeFrameKey = 1 AND IsPromo = 1
	GROUP BY DepartmentKey
)
UPDATE t SET ProductCount = p.PD
FROM #TotalsUpdate AS t
JOIN ProductCount AS p ON t.DepartmentKey = p.DepartmentKey

Execution Information (2 part operation):

Execution Information - Insert and Update

Execution Plan:


Execution Plan - INSERT


Execution Plan - UPDATE

As you can see, when you are using columnstore indexes, you don’t want to do a COUNT(DISTINCT) in the same operation as a SUM: even though it gives a nice, linear execution plan, the cost of the entire operation is a lot bigger.

So, when you are using columnstore indexes, either use CTEs to split the work up or use the insert-then-update method for optimal performance ☺

September 1, 2013

Rubik’s Cube Help Videos

Filed under: Fun — Andrew @ 10:40 pm

Today I spent a good few hours trying to figure out how to solve the Rubik’s cube.   After about 4 hours of working on it (with the algorithms written down) and watching videos, I was finally able to successfully complete it.

Of course, I haven’t been able to complete it again … but I’m still trying!

One of the most helpful video playlists is available here.

Hopefully it’ll help you to solve the Rubik’s Cube!

August 31, 2013

MySQL Convenience Configuration File

Filed under: MySQL — Andrew @ 6:28 pm

Today I was playing with the mysql client and I noticed something very interesting.  In the past I had seen a .my.cnf file being generated in my home folder, but I had no idea what it actually did.

The structure of the .my.cnf file:

[client]
user = username
password = password

Apparently the mysql client will automatically look in your home directory for this file and, if it is present, will try to use it to log in.
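The file uses INI-style syntax, with the client credentials living under a [client] group. As a sketch, the structure can be checked with Python’s configparser (the sample string below just mirrors the example above):

```python
import configparser

# .my.cnf is INI-style; the mysql command-line client reads
# its credentials from the [client] group.
sample = """\
[client]
user = username
password = password
"""

config = configparser.ConfigParser()
config.read_string(sample)
print(config["client"]["user"])  # username
```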

August 30, 2013

Programmatic Way of Generating a Column List (for inserting or selecting)

Filed under: SQL Server — Andrew @ 8:47 am

When seeding databases for the first time, I sometimes find myself having to manually build a list containing all of the columns in a table. Doing this manually usually means reviewing the list once or twice, just because I included an IDENTITY column that I didn’t mean to, or added a computed column that I didn’t realize existed or was in use.

Because of this, I have written a script to make my life easier. Hopefully it will help you too!

Script available here.

-- ==============================
--  Variables to change
-- ==============================
-- The table name that we are using
DECLARE @TableFullName SYSNAME = '[Production].[Product]'
		-- Should we include computed columns in the CSV?
		, @IncludeComputedColumn BIT = 0
		-- Should we include the identity column in the CSV?
		, @IncludeIdentityColumn BIT = 1
-- Internal Variables
DECLARE @ObjectID INT = OBJECT_ID(@TableFullName)
		, @CSVInsertList VARCHAR(MAX)
-- Make sure the object exists
IF @ObjectID IS NULL
	-- It doesn't, so we are done
	PRINT 'The table, ' + @TableFullName + ' does not exist.'
ELSE
BEGIN
	-- Grab a full list of columns that we need in CSV
	SELECT @CSVInsertList = ISNULL(@CSVInsertList + ', ', '') + c.name
	FROM sys.tables AS t
	JOIN sys.columns AS c ON t.object_id = c.object_id
	WHERE t.object_id = @ObjectID
		AND ( c.is_computed = 0 OR @IncludeComputedColumn = 1 )
		AND ( c.is_identity = 0 OR @IncludeIdentityColumn = 1 )
	-- Check if IDENTITY_INSERT will need to be on (if there is an identity column and we are including it)
	IF ( EXISTS(SELECT 1 FROM sys.columns AS c WHERE c.object_id = @ObjectID AND c.is_identity = 1) AND @IncludeIdentityColumn = 1 )
		-- Inform the user
		PRINT 'The table has an IDENTITY column on it.'
	-- Emit the CSV
	PRINT @CSVInsertList
END
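The same idea carries over to other engines. In SQLite, for example, PRAGMA table_info exposes the column metadata, so a comma-separated column list can be built the same way (the Product table below is just a made-up example):

```python
import sqlite3

# Build a CSV column list from SQLite's column metadata
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Product (
    ProductID INTEGER PRIMARY KEY,
    Name TEXT,
    ListPrice REAL
)""")

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
columns = [row[1] for row in conn.execute("PRAGMA table_info(Product)")]
csv_list = ", ".join(columns)
print(csv_list)  # ProductID, Name, ListPrice
```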

August 29, 2013

Visual Studio – Show Whitespace Characters

Filed under: Visual Studio — Andrew @ 11:19 am

Coming from coding in environments where whitespace matters (e.g. makefiles and Python), I became accustomed to showing all of the whitespace characters because I needed a quick and easy way to figure out why those files weren’t working.

In Visual Studio, there is also a setting that will allow you to do that.  There are two different ways to get at it:

  1. File Menus
    1. Edit
    2. Advanced
    3. View White Space
  2. Keyboard Shortcut
    1. CTRL+R, CTRL+W

Once done, you will see a small dot (·) in the middle of the line for each space and an arrow (→) for each tab.


This is also super handy for tracking down rogue spaces at the end of a line.

August 28, 2013

Server Check In

Filed under: Maintenance,Server — Andrew @ 6:35 am

I recently found a great tool that will not only help you keep track of your servers, but also keep you informed when a server that you use becomes inaccessible.

The tool: https://servercheck.in

I bought the $15 package when I heard about it on LowEndTalk, just to see what it did because I was curious, and now I’m glad that I did because it brings so much insight.

What does it do?

Polls your site every 10 minutes (every 5 with the premium account) using one of the following checks:
- Website – HTTP/HTTPS 200 OK
- Website – HTTP/HTTPS Content
- Server – Basic Ping

If your site is unavailable on the requested method, you immediately get an email (or a text message) saying that your site is down. The application will then continue to monitor your site’s status and let you know as soon as the site has become accessible again. Once the server is back up, you will get one last email saying that your server is back up and how long it was actually down for.

This alleviates the burden of manually checking whether your site is still up, and gives you evidence in hand if you need to send in a support ticket.

On the website, you can even see how much downtime was actually experienced over the last 30 days. What does that buy you? The ability to go to your hosting company, who (for the most part) promise 99.9% uptime, and give them proof that their SLA has not been met (in my case, this blog currently has 99.03% uptime).
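To put those numbers in perspective, an uptime percentage translates into expected downtime over 30 days with a quick back-of-the-envelope calculation:

```python
# 30 days = 43,200 minutes; downtime is the missing fraction of that.
MINUTES_PER_30_DAYS = 30 * 24 * 60

def downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime implied by an uptime percentage over 30 days."""
    return MINUTES_PER_30_DAYS * (100 - uptime_percent) / 100

print(round(downtime_minutes(99.9), 1))   # 43.2 minutes for a 99.9% SLA
print(round(downtime_minutes(99.03), 1))  # 419.0 minutes, roughly 7 hours
```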

If that was all it did, it would already be useful; however, it does even more!

Each time it hits your site, it records the network latency (the time delay in reaching the site) and reports it in a graph, which can be useful to see.

Server Check In Example

In short, it’s a great tool that I am glad I ended up buying because it gives great insights.

Check it out!
