Decompression Time

As DBAs we work long hours. Many times fixing issues before they become issues to end users. Constantly get thrown under the bus and told how to do things on our servers by uninformed and non-technical weenies people. For time to time we need to decompress I hit the road for Spring Break.

WPPeakHeading to the Slopes

This year similar to the last three we are going skiing in Colorado. We are heading back to Winter Park this year. The last time we were there two years ago our son broke his ankle about two weeks before. Over the entire trip was a really good. Even for him. Since he was not able to ski Winter Park that year we are heading back. The only difference besides the lack of crutches my wife is not making the trip with us. It’s a guy’s trip this year.

The Drive and Weather Factor

The past two years we had a convoy with 2 other families and we drove straight through the night. This year I will be the only driver. I am not planning on driving straight through even though it would be nice to do. If I get tired I will just pull over and take a nap for a while.

I have been watching the past three weekends in a row winter storms have moved through the area we will be driving through. Last week, certain parts of the pan handle of Texas the roads were shut down. If Mother Nature is listening or reading this post, if you could hold off until Sunday evening to drop the snow that would be great, emkay?

 Lunch at the Blue Bonnet 

bluebonnet

When my wife and I lived in Denver we used to eat at the Blue Bonnet Café on a fairly regular basis. The plan is to try and eat lunch there on Saturday on the way through town. I think it will be neat to show our son where we used to “hang out’ so to speak.  Looking at the web site the place has changed a little since we last ate there in the late 80’s early 90’s. It will bring back memories and make new ones I am sure. The sad part, my wife won’t be there to enjoy. There is always next year though.

Heading Over the Pass

No not SQLPASS, Berthoud Pass. The best part of driving to Winter Park and going over the pass is once you get to the top, it’s all downhill to the resort, literally. Hopefully the traffic will be on our side and we will not have any issues like they did yesterday where I-70 was closed for a time with a multi-car pileup.

Conclusion

We all work long stress filed hours each day. It does not matter what you do for a job/career. From time to time you must “get away” from the daily routine and decompress. Weekend trips here can help over the long haul but downtime is a must to keep your mind, body, spirit and drive going. Maybe I will see you on the slopes. If not, I usually take a trip to the beach in the summer, maybe then.

Cheers!

Advertisements

How to get useful information from sys.dm_os_performance_counters

This DMV can be very useful to get information about SQL server. I need to point out a few key points before going too far into how we can use it. While it does return a wealth of information it does not return any counters outside of SQL Server for obvious reasons. The data returned can be a little confusing as well. Without further ado let’s get into it.

sys.dm_os_performance_counter

chart_small

See the Note on the column named cntr_value

For now let’s take a look at the cntr_type information returned and see if we can clarify that a bit. Using the below query to get the counter types.

distinct_cntr_type_query

Now we have our cntr_type but we need to figure out what these numbers mean.

cntr_type

If we look on msdn we find information to help understand…hopefully 🙂

PERF_COUNTER_LARGE_RAWCOUNT: 65792: Decimal: This value does not need calculations is the same as PERF_COUNTER_RAWCOUNT (65536) but is 64-bit for larger numbers values

PERF_COUNTER_BULK_COUNT: 27266576: Decimal: Average number of operations completed during each second of the sample interval. This counter type is the same as the PERF_COUNTER_COUNTER type, but it uses larger fields to accommodate larger values.

PERF_AVERAGE_BULK:

537003264: Decimal:

1073874176: Decimal:

Number of items processed, on average, during an operation. This counter type displays a ratio of the items processed (such as bytes sent) to the number of operations completed, and requires a base property with PERF_AVERAGE_BASE as the counter type.

PERF_LARGE_RAW_BASE: 1073939712: Decimal: This counter type collects the last observed value. It is the same as the PERF_RAW_BASE counter type except that it 64-bit and uses larger fields to accommodate larger values.

query2

Using the query above written by Amit Banerjee twitter | blog

The below image is what we will see in the results. Still not the useful data we are after but we are getting closer.

query2_return

Now on the first image there is a Note we need to be aware of.

For per-second counters, this value is cumulative. The rate value must be calculated by sampling the value at discrete time intervals. The difference between any two successive sample values is equal to the rate for the time interval used.

Keeping that in mind, if we look at the cntr_type 272696576 definition it is an “each second” value. We can run the query below gather the values run it again in say 1 second and determine our value for one second.

query_per_second

As you can see this DMV has some very useful information but at face value can be a little confusing. Only after digging deeper into  the details on the cntr _type and doing additional querying and calculations can we get useful data.

Cheers!

Reference:
sys.dm_os_performance_counter
http://technet.microsoft.com/en-us/library/ms187743(v=sql.105).aspx

WMI Performance Counter Types (Windows)
http://msdn.microsoft.com/en-us/library/windows/desktop/aa394569(v=vs.85).aspx

A lighthearted look at our jobs

I need some help and I am reaching out to the community for some feedback. I am looking at gathering information along the lines of something like this.

If you aren’t checking to see if backups ran last night…

…You might be doing it wrong

If you aren’t reviewing SQL and or Server logs

…You might be doing it wrong

If you aren’t practicing restoring your backups

…You might be doing it wrong

Hopefully you get the idea and what I am trying to compile. Add comments and if I use your input proper attribution will be mentioned, that is, if I already don’t have it myself or someone else has not already provided the feedback.

Now to get things started.

If you don’t leave a comment here…

…You might be doing it wrong

Cheers!

SQL Server Accounts Locked Out

Those dreaded account lockouts. We all have to deal with them from time to time. I created a job to be notified when accounts become locked. That way I am aware of the issue and depending on the account, server and other factors I can deal with it with it in a timely manner.

Problem

When accounts become locked users can’t do their job or in worst case scenarios applications fail to work. I won’t go into details about application accounts locking other than saying it is an application design flaw.

The Code

1

You can see we need a few variables defined.

2

Then comes the formatting for the HTML output, this can be a little cumbersome if you don’t know HTML or want it to be really visually appealing. Personally, I just want the basics and that means looking better than plan old text display. It also provides a little more control over the output.

3

Then creating a temp table and insert the results. The heart of this code is query. Without the use of CAST and CONVERT or you will get an error when trying to insert into the temp table. I insert only for row counts, I then use @@ROWCOUNT to set a flag to send/not send the email notification. I did try and get the row counts from the query within the @tableHTML but always got one extra row even though no accounts were locked.

4

If there are accounts locked, the sp_send_dbmail code is used sending an email. If you look at @emailsubject the server name is inserted after some descriptive wording. This used for identification for the DBA group as well as the help desk when creating the tickets and the project it gets routed to. Makes it much easier to unlock accounts if you know what server they are on.

5

This is an example what the email looks like when received.

Conclusion

If you have unlock user accounts frequently or just have a need to be notified for awareness of accounts being locked this is one way you can do it. Are there others? Sure there are, but this is what I have come up with for now. The code is based off of this link from msdn on sp_send_dbmail http://msdn.microsoft.com/en-us/library/ms190307.aspx

I have added this code to my DBA Toolbox. To see some other snippets look under the DBA Toolbox category.

Cheers!

Dam the Security Risks, Full Speed Ahead

Being a former Navy man I thought a spin-off of the phrase “Dam the torpedoes, full speed ahead” said by Admiral David Glasgow Farragut was appropriate. Changing the phrase to fit my thoughts on how many software companies and developers still to this day don’t consider security when designing or updating application design and code.

Three Strikes and You’re out

clinteastwood

It seems enforcing policies of complex passwords and locking accounts after 3 failed attempts is causing problems for one application in particular.  It was written back at the beginning of the millennium and apparently not updated since.

According to complaints, users are being locked out because the policies are too strict. In addition several service accounts created during installation and needed to run the application are being locked out. One of the accounts is having major issues. Seems it runs a critical part of the application and is used by users when creating ad-hoc reports. If they don’t create the reports correctly and lock the account it brings down a major part of the application. Now that is some fancy coding right there folks.

Vendor Recommendations

They software vendor is requesting we remove the password policy so the users and the service accounts will never lock out. Their reasoning is based on how “it works” in the civilian sector without any password restrictions or lockout polices.

This way the users can attempt to log into the application as many times as needed until they log in with the correct password. Not to mention some of the passwords like “password1” and others similar can continue to work. After all it is a COTS product and any modification will require moths of code changes and tests and could cause major delays in the deployment.

Attitude Adjustment

Normally when those words are spoken by me or someone I call a friend, we are talking about adult beverages being consumed. This particular occasion I am speaking of my attitude and apparently what I must do to remain sane due to all the anti-intelligent thinking going on.

During a conference call the other day I was chatting with one of the managers via instant messenger about the request. Their advice; “Lower your standards”. Now, I really like this manager. But I must respectfully disagree with them on several levels on this one. Why is doing the right thing by enforcing policies for safety, security and just plain common sense so hard?

common-sense3

You can rationalize all you want if that makes you sleep better at night. I on the other hand have a real hard time doing things that just don’t make sense or are changed due to the political nature of the project.

Conclusion

“If doing the right thing is wrong, then I don’t want to be right.” – Unknown Author

The image wording to the left may be little extreme. Perhaps I am over reacting. Or maybe I actually care about some of these things others seemingly don’t care about. Until there is some type of preach or someones data is exposed due to loosening of some of the security safeguards. I don’t know. I think for now, I am just going for my type of attitude adjustment.

Server TCP provider has stopped listening [ 1433 ] due to a failure

I have been seeing this error off and on past month, month and a half. The error was only seen coming from 7 servers and randomly. Another reason it took so long to find and determine the cause was due to trying to collect the errors and investigate what was going on during those times.

Error Reported

DESCRIPTION:    Server TCP provider has stopped listening on port [ 1433 ] due to a failure. Error: 0x2747, state: 2. The server will automatically attempt to reestablish listening.

When researching the error I found several forum and blog posts related to the error message. This one one indicates it may be the server under powered for the workload and it is saturating the buffer space in the processor. To read more follow this link.

http://social.msdn.microsoft.com/Forums/en/sqldataaccess/thread/e52cdda2-dd0b-441f-b997-0d41f0138a55

Here is an excerpt from the thread.

“The errors above are pretty critical. 0x2747 (10055) maps to WSAENOBUFS (An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.). In general this indicates issues with the paged or non-paged pool in the kernel. The simultaneous IO hick-up is something very bad, too – corruption or data loss may occur if SQL is not able to write to the IO”

Another link below indicates the same information as above in the processor buffers are being saturated or the server is under powered.

http://www.sqlservercentral.com/Forums/Topic1098674-146-1.aspx#bm1099504

Gathering Information

To get more information in an effort to determine if the servers were under powered I created a server sides trace using the Errors and Warning Event Category. This group also has the CPU Threshold Exceeded Event.  Keep in mind if you use the same group you may capture a lot of information. I would suggest you closely monitor the amount of data and the drive space where you are saving the traces too so you don’t fill the drive causing further issues.

Other Things to Consider

While investigating the above error there was an email thread started from within the System Admin group, at some point we (DBAs) were included. The discussion was around the product we use enterprise wide for Host Intrusion / Firewall / Virus Protection. In the email thread it was mentioned trouble with automatic updating of said software and issues being seen by SAs on application servers after the updates. The solution they used was to disable parts of the software.

I then reached out to one of the SAs to get more details. After obtaining the details I logged on to the servers in question and disabled the HIPS service. I documented what was done for future issues and to cover my “you know what”. After a week we have not seen any of the above errors. I will continue to monitor and gather data to make sure we are not having the other issues mentioned but so far I have not seen any.

Conclusion

Keep in mind researching an error, and finding a possible fix may not be the solution for you in your environment. The error mentioned usually indicates issues with the kernel and associated with heavy workloads. This information I was able to gather did not indicate that at all.

Trouble shooting errors is a time consuming process. It took over a month working this issue. I will continue to gather information and monitor the servers in question but for now things appear to be solved.

Cheers!

No Admin Account on Server

We have separate domains where I work. What I mean by that is, the development domain is a sandbox so to speak. Where developers can play all they want, do what they want without causing issues on SQA, Pre-Production or Production servers.

Situation

This past week a project development server was moved out of the development domain and into the main domain. This was all done without my knowledge by system admins. How I found out about it was a developer pinged my in Instant messenger saying he couldn’t log into the server with his old domain username and password.

After asking the build manager if this was in fact the case and receiving confirmation it was and then failing to log into the sever with my super account we had a problem.

Assessment

The VM was moved from one domain to another before I was notified. Had I been notified I could have added an account with elevated privileges or enabled the SA account (we disable by default) in order to add the new domain user accounts. That would have made things much easier. As it was, there were no accounts with permissions to add users. We did have SQL accounts for the application but they were not useful since they have minimal permissions to run the application.

After several different attempts using various things such as different super accounts, IP address and even a little FM nothing worked. I started to search on Google and found this post.

http://dba.stackexchange.com/questions/11299/how-to-add-sysadmin-to-user-in-sql-server-2008-when-no-sysadmin-accounts-exist

It turns out there is a tool that works very nicely and very quickly I might add. PsExec from Windows Sysinternals is the tool you need and can be downloaded from here (http://technet.microsoft.com/en-us/sysinternals/bb897553)

Shawn Melton blog | twitter is the one pointing out the tip in the previously mentioned link. Now following on twitter. Thanks Shawn

psexec

Using the tool is just as simple. Open a command prompt and execute the following command

PsExec  – i – s SSMS.exe

-i runs the program the program so it interacts with the desktop of the specified session on the remote system. If there isn’t a session specified the process runs in the console session.

-s Runs the remote process in the System account.

ssms

SQL Server Management Studio will open with elevated permissions and allow you to add the accounts needed in order to manage the server properly.

login

Conclusion

While the fix for this particular situation was a fast and easy one it does not diminish the fact that things were done by a system administrator with one of my SQL Servers and without my knowledge. Had I been properly looped into the conversation I could have prevented any of this from happening. Hopefully others will learn from this. I have even written this up and posted internally on the SAs best practice list.

Cheers!