ISNUMERIC() ISNOT(ALLBAD)

I recently had to write what could have ended up being a complicated updated. Keep in mind, the usual pitfalls of ISNUMERIC() that get brought up weren’t an issue here. Everything in the columns is either a number, or blank (not NULL, unfortunately, but I had no control over that). So testing for the presence of a number suited my needs fine.

Most interesting for me was summing the results of ISNUMERIC() instead of writing a million different case statements to catch all of the different possible combinations that to build my result set.


UPDATE NPS
SET ORC_SEGMENT = CASE
 WHEN (ISNUMERIC([FY14 Domestic]) = 1 OR ISNUMERIC([FY14 International]) = 1)
 AND (ISNUMERIC([FY14 Ground]) + ISNUMERIC([FY14 Office]) + ISNUMERIC([FY14 Freight]) = 0) THEN 1

 WHEN ISNUMERIC([FY14 Ground]) = 1 AND
 (ISNUMERIC([FY14 Domestic]) + ISNUMERIC([FY14 International]) + ISNUMERIC([FY14 Office]) + ISNUMERIC([FY14 Freight]) = 0) THEN 2

WHEN
 (ISNUMERIC([FY14 Domestic]) + ISNUMERIC([FY14 Freight]) + ISNUMERIC([FY14 International]) + ISNUMERIC([FY14 Ground]) + ISNUMERIC([FY14 Office])) >= 2 THEN 3

WHEN
 (ISNUMERIC([FY14 Domestic]) + ISNUMERIC([FY14 Freight]) + ISNUMERIC([FY14 International]) + ISNUMERIC([FY14 Ground]) + ISNUMERIC([FY14 Office])) = 0 THEN 4
 ELSE NULL END

Deduping based on contact criteria

One of our tracker projects has a sampling requirement to exclude anyone who has been emailed in the last two months, or completed an interview in the last 6 months. This stored procedure quickly does that.


USE [Sample]
GO
/****** Object: StoredProcedure [dbo].[j3688830hist_dupe] Script Date: 12/10/2013 9:09:42 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER procedure [dbo].[j3688830hist_dupe] as

--create table sample.dbo.j3688830_hist (
--id nvarchar(64),
--email nvarchar (128),
--maildate datetime,
--comp int
--)

truncate table sample.dbo.j3688830_hist

insert into sample.dbo.j3688830_hist (
id, email, maildate, comp
)
select
id
, email
, coalesce(orc_maildate4, orc_maildate3, orc_maildate2, orc_maildate1)
, case when queue = 'COMPLETED' then 1 else 0 end
from [SERVER].j3688830.dbo.sample_web

update s1
set s1.statusflag = case
 when s1.email = s2.email and s2.comp = 0 and datediff(mm, s2.maildate, getdate()) <= 2 then 13
 when s1.email = s2.email and s2.comp = 1 and datediff(mm, s2.maildate, getdate()) <= 6 then 14
 else statusflag
 end
from j3688830 s1,
 j3688830_hist s2
where s1.statusflag = 0

EAV of Destruction

My company uses these horrible EAV tables to store data generated by SPSS Dimensions data collection. It’s a pain because information is stored across multiple tables, and querying to view it horizontally is difficult. These are a work in progress, but a good starting point was putting together a couple dynamic pivot tables.

I use CTEs to alias the variable numbers as their text values, and a regular select to generate the list of pivot column values. ¬†What I end up with is a large table of horizontal data. It’s not the most performant thing in the world, and still needs some polishing, but it’s getting there. I’m going to drop using the temp tables and just query the CTEs eventually.

</pre></pre>
declare @rcols1 nvarchar(MAX),
 @rcols2 nvarchar(MAX),
 @rquery1 nvarchar(MAX),
 @ocols1 nvarchar(MAX),
 @ocols2 nvarchar(MAX),
 @oquery1 nvarchar(MAX),
 @bigquery nvarchar(MAX)

;with repaggr1(vpath) as (
select distinct quotename(r.variableid) + ' AS ' + quotename(v.variablename)
from Responses2 r join Variables v
on r.VariableID = v.VariableID
and v.Type in (1,3)
)
SELECT @rcols1 = STUFF(
 (SELECT distinct ',' + c.vpath
 FROM repaggr1 c
 ORDER BY 1
 FOR XML PATH(''), TYPE
 ).value('.', 'NVARCHAR(MAX)')
 ,1,1,'')

;with repaggr2(vpath) as (
select distinct r.variableid
from Responses2 r join Variables v
on r.VariableID = v.VariableID
and v.Type in (1,3)
)
SELECT @rcols2 = STUFF(
 (SELECT distinct ',' + QUOTENAME(cast(c.vpath as varchar(20)))
 FROM repaggr2 c
 ORDER BY 1
 FOR XML PATH(''), TYPE
 ).value('.', 'NVARCHAR(MAX)')
 ,1,1,'')

;with repaggo1(vpath) as (
select distinct quotename(r.variableid) + ' AS ' + quotename(v.variablename)
from OtherData2 r join Variables v
on r.VariableID = v.VariableID
and v.Type = 2
)
SELECT @ocols1 = STUFF(
 (SELECT distinct ',' + c.vpath
 FROM repaggo1 c
 ORDER BY 1
 FOR XML PATH(''), TYPE
 ).value('.', 'NVARCHAR(MAX)')
 ,1,1,'')

;with repaggo2(vpath) as (
select distinct r.variableid
from OtherData2 r join Variables v
on r.VariableID = v.VariableID
and v.Type in (2, 5)
)
SELECT @ocols2 = STUFF(
 (SELECT distinct ',' + QUOTENAME(cast(c.vpath as varchar(20)))
 FROM repaggo2 c
 ORDER BY 1
 FOR XML PATH(''), TYPE
 ).value('.', 'NVARCHAR(MAX)')
 ,1,1,'')

SELECT @rquery1 = 'WITH R1 AS (SELECT serial, ' + @rcols1
SELECT @rquery1 = @rquery1 + char(10) + ' from
 (
 select serial
 , response
 , variableid
 from responses2
 ) x
 pivot
 (
 max(response)
 for variableid in ('
SELECT @rquery1 = @rquery1 + char(10) + @rcols2 + char(10) + ') ' + char(10) + ')p )' + char(10)

SELECT @oquery1 = ', O1 AS (SELECT serial, ' + @ocols1
SELECT @oquery1 = @oquery1 + char(10) + ' from
 (
 select serial
 , textval
 , variableid
 from otherdata2
 ) x
 pivot
 (
 max(textval)
 for variableid in ('
SELECT @oquery1 = @oquery1 + char(10) + @ocols2 + char(10) + ') ' + char(10) + ') p )'

SELECT @bigquery = @rquery1 + @oquery1 +
 char(10) +
 'select o.*, r.*
 from O1 o join R1 r
 on o.serial = r.serial
 and o.serial > 0
 and r.serial > 0
 where r.comp = 6
 order by cast(r.serial as int)'

exec (@bigquery)

&nbsp;
<pre>

BCP is your friend

I will pretty often get half a dozen or so text files, pulled by some Grand Inquisitor who hates delimiting data. I do get a layout file, though, which at least allows me to faux-automate some of the process. Rather than merging all the text files, dealing with the per-file headers, and then using text to column in Excel to put the data in readable format, I’ve started doing the following:

The batches come with a layout file that gives column lengths and starting positions of the data. So I use that to make a table, like this:


--DROP TABLE SAMPLE.DBO.J3689116_BCP

CREATE TABLE SAMPLE.DBO.J3689116_BCP (
[DUNS] [VARCHAR] (9),
[BusinessName] [VARCHAR] (90),
[Tradestyle] [VARCHAR] (90),
[Address1physical] [VARCHAR] (64),
[Address2physical] [VARCHAR] (64),
[CityName] [VARCHAR] (30),
[PostalCode] [VARCHAR] (9),
[StateProvince] [VARCHAR] (30),
[CountryName] [VARCHAR] (20),
[CountryAccessCode] [VARCHAR] (4),
[Telephone] [VARCHAR] (16),
[CEOName] [VARCHAR] (60),
[CEOTitle] [VARCHAR] (60),
[US1987SIC1L2] [VARCHAR] (2),
[US1987SIC1R2] [VARCHAR] (2),
[US1987SIC2] [VARCHAR] (4),
[US1987SIC3] [VARCHAR] (4),
[US1987SIC4] [VARCHAR] (4),
[US1987SIC5] [VARCHAR] (4),
[LineofBusiness] [VARCHAR] (41),
[StatusCode] [VARCHAR] (1),
[SubsidiaryCode] [VARCHAR] (1),
[AnnualSales] [VARCHAR] (18),
[AnnualSalesUSD] [VARCHAR] (15),
[EmployeesTotal] [VARCHAR] (7)
)

--truncate table SAMPLE.DBO.J3689116_BCP

Which allows me to write a format file:


EXEC XP_CMDSHELL 'bcp SAMPLE.DBO.J3689116_BCP format nul -c -t -f "\\path\J3689116.dat" -S SERVERNAME -T'

Which allows me to import each file using this command. I just have to replace the file name.


EXEC XP_CMDSHELL 'bcp SAMPLE.DBO.J3689116_BCP in "\\path\Canada1_99 Sample_Out.smp" -f \\path\J3689116.dat -S SERVERNAME -T'

Doing some file processing with Power Shell

This code does a few things.

First, it copies a text file from each subdirectory into on main directory

Second, it merges them all into one combined file.

Finally, it performs some cleaning, removing unnecessary strings, and empty lines. Most importantly, since each of the files has a set of headers, and we only need one, I use Set-Content to write the file with one set of headers at the top, and then the cleaned data by storing each in a variable.


$files = ls "\\path\S*" -recurse -Include *.txt</pre>
foreach-object {
copy $_ "\\path\Erik"
}

--

Get-Content C:\erik\*.txt | Out-File C:\erik\whatever.txt -Encoding Unicode -Width 128

--

$b = "header information to merge back in"
$a = (Get-Content C:\erik\whatever.txt) |
Foreach-Object {$_ -replace "duplicate header information", ""}|
Foreach-Object {$_ -replace "bad character", ""}|
? {$_.trim() -ne "" }
Set-Content C:\erik\whatevercleaned.txt -value $b, $a
<pre>

Querying SQL with Power Shell

This code will create a .csv file of a table on your user desktop. I have to send table information quite a bit, and it’s easier to manipulate this code than to query/copy/paste in my current db set up, where each project gets its own db on a live server. It also will put it on the desktop of the current user. I added the -notypeinformation argument to keep the file from starting with non-table information.

</pre>
$SqlQuery = "select * from TABLE"
$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server=SRVRNM; Database=DBNAME; Trusted_Connection = yes"
$SqlCmd = New-Object System.Data.SqlClient.SqlCommand
$SqlCmd.CommandText = $SqlQuery
$SqlCmd.Connection = $SqlConnection
$SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
$SqlAdapter.SelectCommand = $SqlCmd
$DataSet = New-Object System.Data.DataSet
$nSet = $SqlAdapter.Fill($DataSet)
$OutputTable = $DataSet.Tables[0]
$SqlConnection.Close();
$OutputTable | EXPORT-CSV "C:\Users\$env:username\Desktop\J3689068_FRESH.CSV" -force -notypeinformation
<pre>

Using t-sql to call curl and download a file

This was a pain because the URI has a redirect in it, but once I figured out the directory structure of the website I was able to work around it.


USE [Sample]
GO
/****** Object: StoredProcedure [dbo].[Wireless_Blocks_File] Script Date: 12/9/2013 12:47:34 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER procedure [dbo].[Wireless_Blocks_File]
as

declare @month varchar(3)
declare @cmd1 varchar(4000)
declare @cmd2 varchar(4000)
declare @cmd3 varchar(4000)

select @month = upper(DATENAME(mm,getdate()))

select @cmd1 = 'EXEC master.sys.xp_cmdshell ' + '''echo D:\path\curl.exe -k -c cookie.txt -d nusername=USER -d password=PASS http://URI/cookie.php -s > D:\path\Wireless_Blocks_SQLtest.bat'''
print(@cmd1)
exec (@cmd1)

select @cmd2 = 'EXEC master.sys.xp_cmdshell ' + '''echo D:\path\curl.exe -b cookie.txt http://URI/WIRE-FULL-[month]13.TXT.zip -O -s >> D:\path\Wireless_Blocks_SQLtest.bat'''
set @cmd2 = replace(@cmd2, '[month]', @month)
print (@cmd2)
exec(@cmd2)
select @cmd3 = 'EXEC master.sys.xp_cmdshell ' + '''D:\path\Wireless_Blocks_SQLtest.bat'''

print (@cmd3)

exec (@cmd3)

Generating inserts dynamically

There are some caveats to this code. It was written rather specifically for my work’s table structure, in that our live tables are almost exact duplicates of our staging tables, except our staging tables carry an additional column for cleaning data. I wrote this a while back and haven’t had a chance lately to refine it.


USE [Sample]
GO
/****** Object: StoredProcedure [dbo].[sampleinsert_noflag] Script Date: 12/9/2013 12:34:24 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER procedure [dbo].[sampleinsert_noflag]
--EXECUTE sampleinsert_noflag 'from table', 'to table', 'where orc_reps = 1'
@tablefrom varchar(512), --table to copy data from
@tableto varchar(512), --table to copy data to
@query varchar(512) = '' --additional query to filter select by
as

declare @cols nvarchar(max)
declare @insert nvarchar(max)
declare @sql nvarchar(max)

select @cols = stuff(
 (
 SELECT ', '+ quotename(COLUMN_NAME)
 FROM INFORMATION_SCHEMA.COLUMNS
 WHERE TABLE_NAME = @tablefrom
 AND COLUMN_NAME <> 'STATUSFLAG'
 FOR XML PATH(''), type
 ).value('.', 'nvarchar(max)'
 ),1,1,'')

select @insert = 'insert into ' + @tableto +
 ' (' + char(10) + @cols + char(10) + ')
select ' + char(10) + @cols + char(10) + 'from '
+ @tablefrom + ' ' + @query

exec (@insert)
print (@insert)

Email validation

I wrote this to catch email addresses with problems in them. Sometimes they’re correctable, other times people supply intentionally invalid data to ¬†avoid spam.


USE [Sample]
GO
/****** Object: StoredProcedure [dbo].[ErikEmailExcludes] Script Date: 12/9/2013 12:24:01 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

&nbsp;

ALTER PROC [dbo].[ErikEmailExcludes]
 @projectID int,
 @email varchar(64)= 'orc_email'
as
DECLARE @ProcessTable varchar(64)
select @ProcessTable=ds_ProcessTable from sample.dbo.ds_dataset where ds_ProjectID=@ProjectID
declare @SQLString varchar(max)

-- Remove leading and trailing spaces from email first
set @sqlString='UPDATE [table]
 SET [email] = ltrim(rtrim([email]))
 FROM [table]'
SET @SQLString=REPLACE(@SQLString,'[table]',@ProcessTable)
SET @SQLString=REPLACE(@SQLString,'[email]',@email)
exec (@SQLString)

--Flag Invalid Emails
set @sqlString='UPDATE [table]
 SET statusflag = case
 when len([email]) < 5 or [email] is null then 101
 when [email] not like ''[A-Za-z0-9_-.'''']%@[A-Za-z0-9_-]%.[A-Za-z]%[A-Za-z]'' then 106
 when charindex(''@'', [email]) is null then 108
 when charindex(''.'', [email]) is null then 103
 when charindex(''_'', [email]) > charindex(''@'', [email]) then 105
 when patindex (''%[ &,":;!+=\/()<>?]%'',[email]) > 0 then 106
 when [email] like ''%.@%'' or [email] like ''%@.%'' then 109
 when [email] like ''%@%@%'' then 102
 when [email] like ''%..%'' then 110
 when len(substring([email], len([email]) - charindex(''.'', reverse([email])) + 2, len([email]))) = 3
 and substring([email], len([email]) - charindex(''.'', reverse([email])) + 2, len([email]))
 not in (''biz'',''cat'',''com'',''int'',''net'',''org'',''pro'',''tel'',''xxx'',''edu'',''gov'',''mil'') then 111
 when len(substring([email], len([email]) - charindex(''.'', reverse([email])) + 2, len([email]))) > 3
 and substring([email], len([email]) - charindex(''.'', reverse([email])) + 2, len([email]))
 not in (''aero'',''asia'',''coop'',''info'',''jobs'',''mobi'',''museum'',''name'',''travel'') then 111
 when [email] like ''no@%''
 or [email] like ''none%@%''
 or [email] like ''non@%''
 or [email] like ''nope@%''
 or [email] like ''noe@%''
 or [email] like ''nop@%''
 or [email] like ''nonone%@%''
 or [email] like ''noemail%@%''
 or [email] like ''no%mail@%''
 or [email] like ''no%comp%@%''
 or [email] like ''noeam%@%''
 or [email] like ''noem%@%''
 or [email] like ''nomeail%@%''
 or [email] like ''noname%@%''
 or [email] like ''noone@%''
 or [email] like ''noprime@%''
 or [email] like ''no%reply%@%''
 or [email] like ''noreply%@%''
 or [email] like ''novalid%@%''
 or [email] like ''notvalid%@%''
 or [email] like ''nowhere@%''
 or [email] like ''unk%''
 or [email] like ''abc%''
 or [email] like ''123%''
 or [email] like ''[0-9]@%''
 or [email] like ''[0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]@%''
 or [email] like ''%@no.%''
 or [email] like ''%@non.%''
 or [email] like ''%@noe.%''
 or [email] like ''%@none.%''
 or [email] like ''%@nope.%''
 or [email] like ''%@XX%''
 then 113
 when [email] like replicate(substring([email],1,1),5) + ''%'' then 113
 else statusflag
 end
 where statusflag = 0'
SET @SQLString=REPLACE(@SQLString,'[table]',@ProcessTable)
SET @SQLString=REPLACE(@SQLString,'[email]',@email)
exec (@SQLString)

Deduping

This stored procedure will take up to three variables to dedupe on. So, if you want to keep duplicate phone numbers as long as the data contains different names, this can handle it. For instance, two parents, one land line. Or multiple contacts at one main business line, but they have different physical addresses.

</pre>
USE [Sample]
GO
/****** Object: StoredProcedure [dbo].[ErikDeDupe] Script Date: 12/9/2013 12:20:24 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

&nbsp;

ALTER PROC [dbo].[ErikDeDupe]
 @dupenum int = 1,
 @projectID int,
 @dupVar1 nvarchar(20) =N'',
 @dupVar2 nvarchar(20) =N'',
 @dupVar3 nvarchar(20) =N''
as

DECLARE @ProcessTable varchar(64)
select @ProcessTable=ds_ProcessTable from sample.dbo.ds_dataset where ds_ProjectID=@ProjectID
declare @SQLString nvarchar(1000)

if @dupVar2 = N''
begin
set @sqlString=N';with dedupe (id, statusflag, Rn) as (
 select id, statusflag, row_number() over(partition by [var1] order by id) as Rn
 from [table]
 )
 update dedupe
 set statusflag = 10
 where rn > [dupenum]
 and statusflag = 0'

SET @SQLString=REPLACE(@SQLString,'[var1]',@dupVar1)
SET @SQLString=REPLACE(@SQLString,'[table]',@ProcessTable)
SET @SQLString=REPLACE(@SQLString,'[dupenum]',@dupenum)
end

if @dupvar2 <> N'' and @dupVar3 = N''
 begin
 set @sqlString=N';with dedupe (id, statusflag, Rn) as (
 select id, statusflag, row_number() over(partition by [var1], [var2] order by id) as Rn
 from [table]
 )
 update dedupe
 set statusflag = 10
 where rn > [dupenum]
 and statusflag = 0'

SET @SQLString=REPLACE(@SQLString,'[dupenum]',@dupenum)
 SET @SQLString=REPLACE(@SQLString,'[var1]',@dupVar1)
 SET @SQLString=REPLACE(@SQLString,'[var2]',@dupVar2)
 SET @SQLString=REPLACE(@SQLString,'[table]',@ProcessTable)
 end

if @dupvar2 <> N'' and @dupvar3 <> N''
 begin
 SET @SQLString=N';with dedupe (id, statusflag, Rn) as (
 select id, statusflag, row_number() over(partition by [var1], [var2], [var3] order by id) as Rn
 from [table]
 )
 update dedupe
 set statusflag = 10
 where rn > [dupenum]
 and statusflag = 0'

SET @SQLString=REPLACE(@SQLString,'[dupenum]',@dupenum)
 SET @SQLString=REPLACE(@SQLString,'[var1]',@dupVar1)
 SET @SQLString=REPLACE(@SQLString,'[var2]',@dupVar2)
 SET @SQLString=REPLACE(@SQLString,'[var3]',@dupVar3)
 SET @SQLString=REPLACE(@SQLString,'[table]',@ProcessTable)
 end

exec (@SQLString)
print(@SQLString)
<pre>