Welcome to Project Orange: Facsimile Copies of Your Database!

Project Orange began long ago as a way to provide copies of databases for development. It was originally named for the DoD's "Orange Book" on computer security.

Using facsimile copies of databases ensures data isn't accidentally leaked, lost, or stolen, and helps developers avoid one of those embarrassing appearances on 60 Minutes explaining how they lost millions of their customers' data. To that end, I'm going to write a series on Project Orange in this blog.

Project Orange has evolved over the years from its beginnings in dBase III to the current version running on SQL Server 2014. Oracle 8.x, PostgreSQL, and MySQL were all once platforms for this approach. Today I'm building it solely for SQL Server due to time constraints; it's likely the project will one day be released to the open source community.

We're going to begin this project using PowerShell as the basis. Among the many ways PowerShell can interact with SQL Server, SMO (SQL Server Management Objects) is the one we'll lean on most.
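
To give a feel for where we're headed, here's a minimal sketch of PowerShell driving SMO: it connects to an instance and scripts out table definitions, the raw material for a facsimile copy. The instance and database names are placeholders for your own environment.

    # Sketch: connect to SQL Server via SMO and script out table definitions.
    # "localhost" and "AdventureWorks2014" are placeholder names.
    [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | Out-Null

    $server = New-Object Microsoft.SqlServer.Management.Smo.Server "localhost"
    $db     = $server.Databases["AdventureWorks2014"]

    foreach ($table in $db.Tables | Where-Object { -not $_.IsSystemObject }) {
        # Script() returns the CREATE TABLE statement(s) for each object
        $table.Script() | ForEach-Object { $_ }
    }

From a skeleton like this, the facsimile step is a matter of replaying those scripts against a development instance and loading scrubbed data.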

Data Modeling Tools Survey

Twelve years ago I did an intensive survey of text editors. A big part of what we data folks do involves writing scripts in plain text documents. Having tried no fewer than 10 products, I chose UltraEdit, still my favorite today because of its fabulous block-editing feature. And it does so much more.

SlickEdit was the most expensive ($300) and had a most interesting interface. It's aimed mainly at C++ coders, so a lot of its features weren't all that important to me. TextPad and whatever else was around back then rounded out the field.

Of course, when I mention the block-editing thing, people always point out that I could've done that in some other editor (TextPad, for example), and every time I find it's nowhere near the same quality or experience. Yep: UltraEdit's still the one.

Now I'm embarking on another quest: finding my daily go-to data modeler. It used to be Visio, until Microsoft killed off reverse engineering starting with the 2013 release. Visio has never been a great tool, but it was good enough, and I'll still use it for inconsequential tasks such as simple flowcharts.

I realize I could work in Visual Studio as well, but for me that's mainly for coding in, say, C#, of which I've done plenty. The experience is geared toward that activity, not data. In fact, data practices seem to sit firmly in the back seat of Microsoft's tool set. No matter: there are plenty of fish in the sea.

For years I've been able to design data models in products like Embarcadero's ER/Studio and CA's ERwin. Along the way I got a chance to work with Sybase's PowerDesigner, and I used Sparx's offering on another gig. All are good, capable tools for the job at hand, but they're just too expensive for me to own.

For the foreseeable future I'm going to be reviewing tools meant for data professionals. If nothing fits, I'll continue the search. This week I'm going to look at Open ModelSphere:

http://www.modelsphere.org

It seems to have a lot of capability and might make a good alternative.

SQL Server 2014: The Future is Now!

Futureman – Béla Fleck and the Flecktones

I've been working with SQL Server since 1996. As it happened, I started with SQL Server 6.0 on a month-long engagement for a point-of-sale project. Immediately following, I worked with one of the very early SQL Server versions, 4.2, on a salt company's inventory reporting system. That gig lasted only a few weeks, but I learned to code stored procedures and got a good overall foundation in client/server architecture. Later that year I was invited by some Microsoft folks to the release party for the BackOffice Suite in Las Vegas. So began a love affair that's lasted nearly 20 years.

Prior to SQL Server, I'd been working on the Unix side of things: Postgres, Progress, Ingres, and lots of xBase code (dBase, FoxBase, FoxPro, Clipper, et al.). Just like everyone else, I had to make a choice: move to the upcoming SQL Server product and client/server, or stay with what I knew.

At that time I believed the sun was setting on many Unix products, as it was on the old xBase languages. I figured the future belonged to distributed computing, and that good systems would be built on server-based data platforms. I believed then (as I do now) that the future belongs to us (and Futureman!).

Back then I worked for small to medium-sized businesses that couldn't afford Oracle, Sun, IBM, or anything on the bigger end of data. To be sure, some did have multiple platforms, and on occasion I'd get a crack at writing PL/SQL and DB2 and seeing how the other side lived. I'm very thankful for those opportunities.

In 2011 and 2012 I focused on Hadoop and how to extend SQL Server using Cloudera's implementation of the popular open source platform. The problem I was solving was finding the right way to create a data lake and leverage it from SQL Server, mainly to get around the need for federation in scale-out deployments on very large systems. Federation is definitely still an answer, but it's woefully complex, expensive, and not always a well-performing approach.

What I love about Hadoop is the built-in toughness on cheap hardware. For the most part it's easy to manage and, in many circumstances, blindingly fast. It's also a small return to my Unix roots, which is fun! I've explored HBase, Cassandra, and CouchDB as well.

In the coming weeks I'm going to dig into cool new features of SQL Server 2014, such as the following (with a small taste of the first one sketched after the list):

  • In-Memory OLTP
  • In-Memory Data Warehouse
  • Updatable columnstore indexes
  • AlwaysOn integration with Azure
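
As that taste of In-Memory OLTP, here's a hedged sketch run from PowerShell with Invoke-Sqlcmd. It assumes the SQLPS module is installed and the target database already has a MEMORY_OPTIMIZED_DATA filegroup; the instance, database, and table names are placeholders of my own:

    # Create a memory-optimized table in SQL Server 2014.
    # Assumes SQLPS is available and the database already has a
    # MEMORY_OPTIMIZED_DATA filegroup; all names are placeholders.
    Import-Module SQLPS -DisableNameChecking

    $createTable = '
    CREATE TABLE dbo.ShoppingCart
    (
        CartId     INT NOT NULL
                   PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
        UserId     INT NOT NULL,
        CreatedUtc DATETIME2 NOT NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
    '

    Invoke-Sqlcmd -ServerInstance "localhost" -Database "OrangeDemo" -Query $createTable

The nonclustered hash primary key is what makes the table memory-optimized; picking a sensible BUCKET_COUNT is one of the new design decisions this feature brings.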

At some point this year I’d also like to continue my SQL Server + Hadoop proposition.

Have a wonderful new year!

Rowland

New York SQL Server Performance Tuning Class

It was great seeing everyone in our NYC class on SQL Server performance tuning! As mentioned, Jason Horner and I had the opportunity to teach it for Pragmatic Works this week. We had a great time in New York, as you might expect.

I'd like to keep learning how to teach, and events like this give me the on-the-job training for that skill.

Thanks Everyone!

Digging Deeper

I'm starting to switch over to winter gardening, which until now has consisted of bringing key plants in for the long, cold winter. Nothing looks better than a bright red geranium in the dead of winter when it's 20 below outside with 3 feet of snow on the ground. This year we're trying hydroponics to see if our vegetable yields will be better than in the last few years. And it gives us something fun to do!

I'm looking into building a fully monitored ebb-and-flow system using a Raspberry Pi and sensors for light, pH, and water. While we're at it, I'm thinking we can also gather the sensor data and store the readings to a database in 15-minute increments.
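
For the database side, here's a rough sketch of what that logging could look like, assuming the readings have already been pulled off the Pi's sensors; the instance, database, table, and sensor names are all placeholders:

    # Log one batch of sensor readings; all names are placeholders and the
    # target table is assumed to exist (ReadingUtc, Sensor, Value columns).
    Import-Module SQLPS -DisableNameChecking

    $readings = @{ Light = 812.0; pH = 6.4; WaterTempF = 68.2 }

    foreach ($sensor in $readings.Keys) {
        $insert = "INSERT INTO dbo.SensorReading (ReadingUtc, Sensor, Value) " +
                  "VALUES (SYSUTCDATETIME(), '$sensor', $($readings[$sensor]));"
        Invoke-Sqlcmd -ServerInstance "localhost" -Database "Greenhouse" -Query $insert
    }

Scheduling that to run every 15 minutes is a job for cron on the Pi or Task Scheduler on a Windows collector.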

When we were in Florida I got a chance to visit my favorite aunt and came home with several cuttings from her plants. I'm looking forward to seeing how her plants do in this system as well.

Found out today that Ralph Kimball's group is closing down at the end of 2015. Sad news, since I've taken many of their classes over the years. If you think learning the Kimball method might help you, you'd better get over to see Ralph and the crew. They've done a lot for this community and will be sorely missed:

http://www.kimballgroup.com/

I'm prepping to teach a SQL Server performance tuning class in New York City next week. These kinds of classes always mean re-reviewing the materials, demos, and such. Someone will undoubtedly ask a question I've never heard or thought of before, too, which adds to the materials every time I teach. I truly enjoy going to New York, as it's the starting point for both sides of my family in America.

Documenting Code with Microsoft Word

Often I find myself putting together documentation for T-SQL, C#, or whatever else in Microsoft Word 2013. Anyone who's tried this knows that getting Word to show code in a format that looks like a printed book can be tricky.

What I've discovered over the years is to create a custom paragraph style. Here are the step-by-step details to do just that:

1) From the Home ribbon, open the Styles gallery and choose Create a Style.

2) Give your style a name, then choose Modify to get at the full set of options.

3) Set the style type to Paragraph, based on (no style), and choose your font and size; a fixed-width font works best for code.

4) From the Format button, open the Paragraph settings and change the line spacing.

5) Now comes the border. I like a line above and below.

6) On the Shading tab, change the background.

7) Finally, let's change the spell check stuff; we don't need the ugly red squiggle things all over our code. You'll find it under the Language choice: check the "Do not check spelling or grammar" box.

There! Now you can easily format your code. I also like to add a custom style for keywords so that the upper-case, bold, blue flavor I like is easy to get just right.

Last, I always make a Word template that contains these customizations so I don't have to rebuild them every time.
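
If you'd rather not click through the dialogs on every machine, the same style can be built with Word's COM automation from PowerShell. This is a hedged sketch; the style name, font, and colors are my own choices, not anything Word prescribes:

    # Build a "Code Block" paragraph style in a new document via Word COM.
    # Constants: 1 = wdStyleTypeParagraph, -1/-3 = wdBorderTop/wdBorderBottom,
    # 1 = wdLineStyleSingle; names and colors are placeholder choices.
    $word = New-Object -ComObject Word.Application
    $doc  = $word.Documents.Add()

    $style = $doc.Styles.Add("Code Block", 1)           # paragraph style
    $style.Font.Name = "Consolas"
    $style.Font.Size = 9
    $style.ParagraphFormat.LineSpacingRule = 0          # single spacing
    $style.Borders.Item(-1).LineStyle = 1               # line above
    $style.Borders.Item(-3).LineStyle = 1               # line below
    $style.Shading.BackgroundPatternColor = 14737632    # light gray
    $style.NoProofing = $true                           # no red squiggles

    # Save as a template so the style travels with new documents
    $doc.SaveAs("$env:USERPROFILE\Documents\CodeDoc.dotx", 16)  # 16 = .dotx
    $word.Quit()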

What’s your favorite trick for documenting with Word?

The Hadoop Buzz

Thank you to all who attended my online workshop this morning! Two questions came up that I'd like to address:

Q: Can you use the ODBC Hive solution to insert data into Hadoop?

A: Not with our current technology. Hadoop data is basically read-only; there's no ODBC way to update a row, for instance. Data can be rewritten using tools like Pig, but what you're really doing is replacing entries.
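
To make the read-only point concrete, here's a hedged sketch of what the ODBC path does support: plain SELECT queries, shown here from PowerShell. The DSN and table names are placeholders for whatever your Hive ODBC driver is configured with:

    # Read from Hive over ODBC; SELECT works, UPDATE/DELETE do not.
    # "HiveDSN" and "web_logs" are placeholder names.
    $conn = New-Object System.Data.Odbc.OdbcConnection "DSN=HiveDSN"
    $conn.Open()

    $cmd = $conn.CreateCommand()
    $cmd.CommandText = "SELECT page, hits FROM web_logs LIMIT 10"

    $reader = $cmd.ExecuteReader()
    while ($reader.Read()) {
        "{0}`t{1}" -f $reader["page"], $reader["hits"]
    }
    $conn.Close()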

Q: Can you use CSV (Cluster Shared Volumes)/Storage Spaces?

A: I did some checking, and the general answer is no. On the other hand, why would you want to? There's blob storage and plain old UNC file paths.

Thanks everyone!

Hadoop, Cloudera and Chicago!

This week I'm in Chicago studying Cloudera's distribution of Hadoop. I decided to take both the Administrator and Developer courses, to be sure we haven't missed anything along the way and to help guide the team's education.

So far the materials have been good: aimed at the right audience, clear, concise, etc. For admins, you'll need a reasonable degree of Linux skill. I started my career in Unix and later adopted Linux, so the OS portion has been easy. The really interesting parts have come from discovering the deeper choices in Hadoop setups and best practices.

Then comes Cloudera Manager. So far the word is "slick"; that's how I'd describe its fairly sophisticated installation and management process. Of course, I knew some of that from my own demos. Cloudera Manager makes setting up large clusters a simple task compared to hand-installing all the bits and editing a dozen or more config files on each node.

I’ll be adding more posts on Cloudera and Hadoop in the coming days.