It was great seeing everyone in our NYC class on SQL Server Performance Tuning! As mentioned, Jason Horner and I had the opportunity to teach this for Pragmatic Works this week. We had a great time in New York, as you might expect.
I’d like to continue learning how to teach, and events like this give me the on-the-job training for that skill.
I’m starting to switch over to winter gardening, which until now has consisted of bringing key plants in for the long cold winter. Nothing looks better than a bright red geranium in the dead of winter when it’s -20°F outside with 3 feet of snow on the ground. This year we’re trying hydroponics to see if our vegetable yields will be better than the last few years. And it gives us something fun to do!
I’m looking into building a fully monitored ebb and flow system using a Raspberry Pi and sensors for light, pH and water. While doing that, I’m thinking we can also gather data from the sensors and store the readings in a database at 15-minute increments.
When we were in Florida I got a chance to visit with my favorite aunt and got several cuttings from her plants. I’m looking forward to seeing how her plants do in this system as well.
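Here’s a minimal sketch of what the storage side might look like, assuming SQLite as the database and one row per sensor per 15-minute bucket. The table name, column names, and sensor names are placeholders I made up for illustration, and reading the actual hardware is left out because it depends on which sensors end up on the Pi.

```python
import sqlite3
from datetime import datetime, timezone

BUCKET_SECONDS = 15 * 60  # the 15-minute increment


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bucket (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) reading table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS reading (
               bucket_utc TEXT NOT NULL,
               sensor     TEXT NOT NULL,
               value      REAL NOT NULL,
               PRIMARY KEY (bucket_utc, sensor)
           )"""
    )
    return conn


def record(conn: sqlite3.Connection, ts: datetime, readings: dict) -> str:
    """Store one value per sensor for the bucket containing ts.

    A later reading in the same bucket replaces the earlier one.
    """
    bucket = bucket_start(ts).isoformat()
    conn.executemany(
        "INSERT OR REPLACE INTO reading VALUES (?, ?, ?)",
        [(bucket, name, float(value)) for name, value in readings.items()],
    )
    conn.commit()
    return bucket


if __name__ == "__main__":
    conn = open_store()
    # Made-up values standing in for real sensor reads.
    when = datetime(2014, 10, 5, 14, 37, 12, tzinfo=timezone.utc)
    print(record(conn, when, {"light": 812.0, "ph": 6.1, "water": 1.0}))
```

Keying on the bucket plus the sensor name keeps the table at one row per sensor per increment, which should make later queries across sensors straightforward.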
Found out today that Ralph Kimball’s Kimball Group is closing down at the end of 2015. Sad news, since I’ve taken many of their classes over the years. If you even think learning the Kimball method might help, you’d better get over to see Ralph and the crew. They’ve done a lot for this community and will be sorely missed:
I’m prepping to teach a SQL Server Performance Tuning class in New York City next week. These kinds of classes always mean re-reviewing the materials, demos and such. Someone will undoubtedly ask a question I’ve never heard or thought of before, too, which adds to the materials every time I teach. I truly enjoy going to New York as it’s the starting point for both sides of my family in America.
Thanks to all who attended my session at SQL Saturday #318 in Orlando yesterday! I’ve posted the materials to the SQL Saturday site located here:
I had a great time catching up with everyone at the conference and I’m looking forward to more this year and next.
Often I find myself putting together documentation for T-SQL, C#, or whatever in Microsoft Word 2013. Anyone who’s tried this knows getting Word to show code in a format that looks like a printed book can be tricky.
The approach I’ve settled on over the years is to create a custom paragraph style. Here are the step-by-step details to do just that:
1) From the Home Ribbon, choose Create a Paragraph Style
2) Give your style a name
3) Change the settings on the font.
4) Change to Paragraph
5) Change a few more settings: set the style type to Paragraph and base the style on (no style). Choose the font and size as well.
7) Change the Line Spacing
8) … and the border
9) Now comes the border. I like a line above and below.
10) And also choose the Shading tab and change the background.
11) Now let’s change the spell check settings. We don’t need the ugly red squiggles under code. You’ll find the setting under the Language choice.
12) Check the box…
There! Now you can easily format your code. I also like to add a custom style for keywords so that the upper-case, bold blue look I like is easy to get just right.
Last, I always make a Word template that contains these customizations so I don’t have to rebuild them for every document.
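For the curious, here’s a rough sketch of what a style like this looks like as WordprocessingML, the XML Word stores in a document’s styles part. This isn’t how I build the style (I use the dialogs above); it just shows the same choices in one place: paragraph type, not based on another style, a monospaced font, single line spacing, a line above and below, a shaded background, and proofing turned off. The style name, font, size and shading color are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def build_code_style(style_id="Code", font="Consolas", half_points=18, fill="F2F2F2"):
    """Build a <w:style> element for a code paragraph style.

    All defaults are illustrative assumptions, not values from the steps above.
    """
    ET.register_namespace("w", W_NS)
    # Paragraph style; leaving out <w:basedOn> means it is based on no style.
    style = ET.Element(
        w("style"),
        {w("type"): "paragraph", w("customStyle"): "1", w("styleId"): style_id},
    )
    ET.SubElement(style, w("name"), {w("val"): style_id})

    ppr = ET.SubElement(style, w("pPr"))
    borders = ET.SubElement(ppr, w("pBdr"))
    for edge in ("top", "bottom"):  # a line above and below
        ET.SubElement(
            borders,
            w(edge),
            {w("val"): "single", w("sz"): "4", w("space"): "1", w("color"): "auto"},
        )
    # Background shading behind the paragraph.
    ET.SubElement(ppr, w("shd"), {w("val"): "clear", w("color"): "auto", w("fill"): fill})
    # Single line spacing (240 = one line when lineRule is "auto").
    ET.SubElement(ppr, w("spacing"), {w("line"): "240", w("lineRule"): "auto"})

    rpr = ET.SubElement(style, w("rPr"))
    ET.SubElement(rpr, w("rFonts"), {w("ascii"): font, w("hAnsi"): font})
    ET.SubElement(rpr, w("noProof"))  # no spelling or grammar squiggles
    ET.SubElement(rpr, w("sz"), {w("val"): str(half_points)})  # size in half-points
    return style


if __name__ == "__main__":
    print(ET.tostring(build_code_style(), encoding="unicode"))
```

Seeing the XML also explains why the template trick works: the style definition travels with the template, so every document created from it already has the code style.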
What’s your favorite trick for documenting with Word?
Thank you to all who attended my online workshop this morning! I had two questions I’d like to address:
Q: Can you use the ODBC Hive solution to insert data into Hadoop?
A: Not with our current technology. Hadoop data is basically read-only. There is no ODBC way to update a row, for instance. It can be updated using tools like Pig, but what you’re really doing is replacing entries.
Q: Can you use CSV/Storage Spaces?
A: I did some checking and the general answer is no. On the other hand, why would you want to? There’s blob storage and plain old UNC file paths.
This fall is shaping up to be a busy one! I’m going to be speaking at SQL Saturday #318 in Orlando on Saturday, Sept 27. My topic is Hadoop + SQL Server: The Emerging Patterns in Big Data
The main web site is:
Please stop by!
This week I’m in Chicago studying Cloudera’s version of Hadoop. I decided to take both the Administrator’s and Developer’s courses to be sure we haven’t missed anything along the way and also to guide the team education process.
So far the materials have been good: aimed at the right audience, clear, concise, etc. For Admins, you’ll need a reasonable degree of Linux skills. I started my career in Unix and later adopted Linux, so the OS portion has been easy. The really interesting areas have come from discovering the deeper choices in Hadoop setups and best practices.
Then comes Cloudera Manager. So far the word is ‘slick’. That’s how I’d describe the way its fairly sophisticated installation and management process works. Of course, I knew some of that from my own demos. Cloudera Manager makes setting up large clusters a simple task compared to hand installing all the bits and editing a dozen or more config files for each node.
I’ll be adding more posts on Cloudera and Hadoop in the coming days.
Our son gave me a banjo last January. I play guitar and do my own maintenance. While I idolize Béla Fleck and the Flecktones, Earl Scruggs, Steve Martin and a few others I never saw myself wanting to take up this particular instrument. After some research I found some replacement parts and had it working in no time. Now I can’t seem to put it down! So now I’m an accidental banjoist of sorts.
Along these lines, a number of people have self-described as ‘accidental DBAs’: people who had the duties thrust upon them through no fault of their own. That wasn’t my own path, but it’s one I understand since so many areas in my career came by way of a need.
Microsoft has a fairly recent article based on this idea (http://blogs.technet.com/b/accidental_dba/archive/2011/05/23/introducing-the-accidental-dba.aspx), but many others predate it, such as the ones over at Simple Talk (www.simple-talk.com) by Jonathan Kehayias, Ted Krueger and a few more of their folks.
Similar to my banjo fun, the only way to progress beyond the accidental phase is to study, read books, talk with others, try out ideas and purposely work at the craft.
I’ve been a Data Architect, DBA, Data Analyst, ETL guru, Report Hack, code monkey and script jockey, bit twiddler, and whatever else. You don’t always know which thing will be next in your career. Right now I’m working feverishly in Hadoop/HDFS/Hive/Pig. I have a project. It has a scope, a deadline, a promise. That tends to sharpen my focus.
My musical instrument list has included trumpet, ukulele, guitar and now — believe it or not — banjo. And so it is in the data disciplines. Sometimes it seeks you out rather than the other way around.
I’m always amazed when someone in this crazy field hates long hours. I think to myself, “If you were a cop, would you hate to go to crime scenes? If you were a fireman, would you hate to go into burning buildings?” Long hours are part of the gig. But that doesn’t mean you should work around the clock either. That can lead to serious burnout. I’ve seen people constantly checking their smartphones, available 24×7, responding to emails at 3:30 a.m.
Along these lines, I’m amused by the recent Yahoo! announcement that its employees will no longer be allowed to ‘Work From Home’. The whole working from home phenomenon came to my house in 1990. I was doing work remotely as a requirement for a job. I thought ‘I’m not disciplined enough for this! I need the structure of an office, to see my coworkers…’. I came to find out the danger wasn’t in not working hard enough; rather, it was in not knowing when to stop. I began working longer and longer hours, to the point I was routinely working 12+ hours a day. Less than a year later I found myself looking for a new job. It’s no wonder that today we see productivity skyrocketing; everyone’s so busy working!
Knowing when to quit became an important skill — one that’s overlooked by millions today. Put down that phone, iPad, mouse. Stop! Close the door to the home office and be part of your family.
Making time for your own life and your family is far more important than learning the latest tech trick, language, what have you. These will come and go. There will always be work, but the kids will only be little for a while.
Update: A friend of mine (Dan English, Microsoft MVP, @denglishbi) notified me that Lara Rubbelke (@SQLGal, also of Microsoft fame) did a similar piece a while back. I’ve seen many of her presentations (not this one) and she’s awesome! Check out her blog posting at:
I’m building a resource list for Hadoop. At first these will be resources that are easy to find, but I hope to grow the list to include more obscure references.
I want to start with Hive because it’s probably one of the most useful pieces of the Hadoop world for experienced data folks. What good is data if you can’t query and analyze it?
Hive DDL Language Manual
I think one of the more useful areas is a quick reference to the language constructs of Hive, which is similar to, but not exactly like, T-SQL or PL/SQL or any other SQL I recall for that matter.
Table Partitioning in Hive
One thing I picked up early on was the way in which we can easily add and delete large amounts of data in Hive. Having done table partitioning in a number of RDBMS platforms, including SQL Server, I found it fairly easy to spot how this works. Basically, the trick is to declare the table using the EXTERNAL keyword, which identifies that the data isn’t directly under Hive’s control but rather external. At this point you’re simply describing the shape of the data. Once that’s done, adding folders under the location adds data, and so on. I’m going to work on a special posting next on just this topic.
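Until that posting is ready, here’s a small sketch of the pattern, written as Python helpers that build the HiveQL statements. The table, columns, partition column and paths are all made up for illustration, and the comma-delimited row format is an assumption. One caveat: with a partitioned table, Hive generally needs to be told about each new folder (ALTER TABLE ... ADD PARTITION, or MSCK REPAIR TABLE) before the data shows up in queries.

```python
def create_external_table(table, columns, partition_col, location):
    """Build a CREATE EXTERNAL TABLE statement that only describes the data's shape.

    Values are interpolated directly, so this is for trusted input only.
    """
    cols = ",\n  ".join(f"{name} {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({partition_col} STRING)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )


def add_partition(table, partition_col, value, location):
    """Register a folder under the table location as a new partition."""
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({partition_col}='{value}') "
        f"LOCATION '{location}/{partition_col}={value}'"
    )


def drop_partition(table, partition_col, value):
    """Remove a partition from the table's metadata.

    For an external table the files themselves stay where they are.
    """
    return (
        f"ALTER TABLE {table} DROP IF EXISTS "
        f"PARTITION ({partition_col}='{value}')"
    )


if __name__ == "__main__":
    # Hypothetical table and HDFS path, for illustration only.
    base = "/data/weblogs"
    print(create_external_table("weblogs", [("ip", "STRING"), ("url", "STRING")], "dt", base))
    print(add_partition("weblogs", "dt", "2014-09-01", base))
    print(drop_partition("weblogs", "dt", "2014-09-01"))
```

This is what makes adding or removing a large slice of data so cheap: each operation is a metadata change pointing at a folder rather than a row-by-row insert or delete.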
The Best Book on Hive (so far): Programming Hive
As a third resource, I highly recommend the best book on Hive so far, Programming Hive from O’Reilly by Capriolo, Wampler and Rutherglen. It’s a fairly short read with good examples. Given that there aren’t many resources outside of Apache or a vendor site, it’s a reasonable attempt to explain Hive.