Thursday, October 24, 2013

The real problems with technology projects and technologists


I actually started writing this post a week before the Obamacare website’s snafus hit the media, and they further underlined my thoughts on what’s wrong with technology today – or rather, with technology projects and implementations.

I’ve had the privilege and honor to work with some extremely competent technologists over the years. Whether they were Microsoft Premier teams, HP’s third-level engineers, Verizon’s architects, support teams and DBAs, or VMware’s Implementation Engineers - these guys know their stuff.
For people as intelligent as we are, though, we really know how to screw up a technology project. This has nothing to do with our “hard skills” and everything to do with a lack of “soft skills”.
·         Communication: Be honest with yourself. How well do you communicate? Specifically, how well do you listen?
       o   Do you hear in 0s and 1s?
       o   Do you take the time to completely listen or are you already formulating your response?
       o   Have you ever sat down and watched a business person work with the product you support?
     §  You would absolutely gain new respect for the business if you did.
Most techies would rather have their mouse taken away from them than talk with the business. Get over it. Start small by nodding and smiling as you walk by them in the halls.

·         Cross-Team Skills: Do you sit down with technologists from other teams and try to understand the limitations/abilities of the technology they support?
      o   When was the last time a DBA told the application team that the ad hoc report they needed would put undue stress on the system if run against production data?
      o   As an App support person, do you let the testing team know you are running “rough draft or dirty” on the newest release?
      o   Do you let a Project Manager know that you are concerned regarding release scope?
      o   Do you, as a Project Manager, let the Business stakeholders know that you can’t meet proposed release dates with the resources available?
At the end of the day, you can only do what you can, but not putting yourself out there and talking with the other teams is a great way to fail your team and your company.

·         Narrow versus broad: I’m not talking body style here but vision. What variables does your support area inject into the overall environment?
      o   You may be a great email administrator but do you necessarily understand data management requirements?
      o   Messaging and BYOD admins are feeling the heat from the network guys because the network guys weren’t aware that the messaging guys were going to start letting people use their mobile devices at the office.
There are no islands where technology is concerned. Keep the OSI Model in mind.

·         Think like a mortal: I say that tongue-in-cheek; read it as “think like the business/customer/consumer.”
      o   I am going online to order something – is it easy? Is it obvious where/how I go about my task? Are there help tips to assist me?
           §  Is there an unnatural pause that makes me wonder if something is broken?
           §  Is there feedback that sets the pace for me?
      o   If your app is web-based, do you have sufficient web servers in place to support the potential audience?
           §  Do you consult with the network guys to confirm that the metrics match expectations?
           §  Do you work with the Virtual Environment support team in the event you need additional systems put into the load balancer?
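
The “sufficient web servers” question above can be roughed out on the back of an envelope. The sketch below is only that: the function name, the headroom figure and the two-server minimum are my own assumptions for illustration, and the real inputs should come from load tests and from the network and virtual environment teams.

```python
import math


def servers_needed(peak_requests_per_sec: float,
                   requests_per_sec_per_server: float,
                   headroom: float = 0.3) -> int:
    """Estimate how many web servers belong behind the load balancer.

    headroom is the fraction of each server's capacity held in reserve
    (0.3 means plan to run each server at no more than 70% of capacity).
    All inputs are assumptions to be replaced with measured numbers.
    """
    if requests_per_sec_per_server <= 0:
        raise ValueError("per-server capacity must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = requests_per_sec_per_server * (1 - headroom)
    # Assumed floor of two servers, so one can fail or be patched.
    return max(2, math.ceil(peak_requests_per_sec / usable))
```

If the answer is larger than what is deployed, that is the conversation to have with the other teams before launch day, not after.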

·         Test like a mortal: You cannot test like a techie. We don’t work, think or act in the same manner that someone front-facing does. We don’t even type the same way. Accept it – we are not the business and therefore we need the business to test. Repeat after me: Just because I support the system does not mean I am smarter than the business or that I understand it. Get the business to test.

I think the traits that make us really good at what we do also make us guilty of arrogance, and therefore unwilling or unable to accept that there are areas we don’t necessarily excel in. So now what? Look at it a little differently. Would you want to go to a doctor who had only received training in the skeletal system? Would you knowingly work in a building designed by an architect who didn’t understand the correlation between temperature change and the materials used? To comprehend the whole, one has to understand not only one’s own working piece but also its impact on the whole.

The problem with technology training is that the focus is too narrow. The courses teach the HOW in the narrowest of circumstances but not the WHEN or WHY. This can be fixed, though. Look at COBIT. Look at ITIL. These frameworks encourage communication, broad vision, and working closely with both the business and other technology teams. I cannot say enough about how much a technologist can grow professionally once they consider the frameworks and their value.

Tuesday, October 1, 2013

No, it's not the network


I would be a rich person if I had a dollar for every time an IT person has claimed that an application or system is slow because of the network. As I type this, I can hear the cheers of network support teams across the world shouting “Hallelujah, somebody said it other than us.” OK, so maybe not across the world but at least in Tampa and Jacksonville.

The network is and always will be an easy target for system issues unless teams are given the tools to “easily” isolate issues. A number of these tools are free and come with the operating system or database. What we’re really talking about here is training for IT staff. I don’t mean reading the books, studying for a test, passing, and then becoming a “real” engineer. I mean understanding the OSI Model's seven layers:

·         Application

·         Presentation

·         Session

·         Transport

·         Network

·         Data Link

·         Physical

It is impossible to completely segregate the layers, so it is completely logical that one layer having issues can bleed into the others. This is really where problem management, thorough root cause analyses and careful incident management come into play.
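
One way to stop guessing which layer is at fault is to time the steps separately. The following is a minimal sketch, not a diagnostic tool: it uses only Python's standard `socket` and `time` modules, and the function name and the two labels in the result are my own. It measures name resolution and the TCP connect on their own, so neither gets lumped in with application time.

```python
import socket
import time


def time_layers(host: str, port: int, timeout: float = 3.0) -> dict:
    """Time name resolution and TCP connect separately, in seconds."""
    timings = {}

    start = time.perf_counter()
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    timings["dns_lookup"] = time.perf_counter() - start

    # Connect to the first resolved address so lookup time is not counted twice.
    family, socktype, proto, _canonname, sockaddr = infos[0]
    start = time.perf_counter()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    timings["tcp_connect"] = time.perf_counter() - start
    return timings
```

If both numbers come back in milliseconds and the application still takes ten seconds to respond, the time is most likely being spent higher up the stack, and that is where to look next.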

I love (not an exaggeration) troubleshooting. I have been asked repeatedly to help teach people how to troubleshoot. While the basics can be taught, troubleshooting is more often an art, a skill honed over many years.

·         Start with the obvious. This may sound silly but it’s not uncommon for engineers to “assume” that a system issue must be something complicated and exotic.

o   Has the issue happened before? If so, when? What is the failure frequency? To the same user/system or another?

§  If so, what resolved the issue then?

This is where strong incident management comes into play. Recording issues along with their resolutions makes it possible to spot trends and repeated issues.
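
As a rough illustration of that trending, here is a small sketch that counts repeated issues from recorded incidents. The `Incident` record layout is hypothetical; a real ticketing system has its own fields and its own reporting, so treat this as the idea rather than the implementation.

```python
from collections import Counter, namedtuple

# Hypothetical record layout; a real ticketing system has its own fields.
Incident = namedtuple("Incident", ["system", "symptom", "resolution"])


def repeated_issues(incidents, min_count=2):
    """Return (system, symptom) pairs seen at least min_count times,
    most frequent first, with the resolutions recorded for each.

    incidents must be a list, since it is scanned more than once.
    """
    counts = Counter((i.system, i.symptom) for i in incidents)
    report = []
    for key, count in counts.most_common():
        if count < min_count:
            continue
        fixes = sorted({i.resolution for i in incidents
                        if (i.system, i.symptom) == key})
        report.append((key, count, fixes))
    return report
```

An issue that shows up three times with three different “resolutions” is a good candidate for real root cause analysis.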

·         Are errors logged? If so, use resources to look up potential resolutions.

Windows, Linux, and Unix operating systems ALL record errors. Applications can be configured to write errors to logs. A word of caution: heavy error writing produces massive logs. Verbose logging should be turned on only to chase true errors, not left on for routine application support.

                This is where the beauty of the internet and support agreements come into play. Internet sites such as eventid.net and vendor sites offering searchable databases can make fast work of troubleshooting.
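
Before searching, it helps to know which error actually occurs most often. The sketch below counts error messages in a plain-text log. The line format it expects (date, time, level, message) is an assumption made for the example; real logs vary, and the pattern would need adjusting to match.

```python
import re
from collections import Counter

# Assumed line format: "<date> <time> <LEVEL> <message>".
LINE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_errors(lines, levels=("ERROR", "CRITICAL")):
    """Count log messages at the given levels; lines that do not
    match the assumed format are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            counts[match.group("message")] += 1
    return counts
```

Any iterable of lines works, including an open file, and `count_errors(...).most_common(5)` gives the five messages worth looking up first.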

·         Create a team across functional areas to troubleshoot and perform root cause analysis work.

o   Consider triage training for your teams (ITIL and MOF are excellent guidelines to follow)

·         Create a “no blame” environment

·         Track changes made to the environment – this should really be given your highest focus as most issues occur because of changes made to the environment.

o   Was a system having memory issues before the application was updated?

o   Was communication between environments an issue before a firewall change?
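
Those two questions are really one question: what changed shortly before the problem started? A minimal sketch, assuming the change log can be exported as a list of (timestamp, description) pairs; the function name and the 24-hour default window are my own choices for illustration.

```python
from datetime import datetime, timedelta


def changes_before(incident_time: datetime, changes,
                   window: timedelta = timedelta(hours=24)):
    """Return changes made within `window` before the incident,
    most recent first. `changes` is a list of (timestamp, description)."""
    earliest = incident_time - window
    recent = [c for c in changes if earliest <= c[0] <= incident_time]
    return sorted(recent, key=lambda c: c[0], reverse=True)
```

The most recent change is not always the culprit, but it is usually the cheapest place to start looking.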

·         Track vendor releases for potential issue resolution.

o   My favorite catchphrase is “Trust, but confirm” – I would never have taken that tack in 2001 with Microsoft. Through the years, however, Microsoft has heeded the message from customers that we would not accept crappy code any longer.

o   So, test, test and test again but UPDATE.

·         Is it the network? It could be, but most likely it’s an architecture issue. Too little bandwidth was architected for the ever-evolving needs of a mobile world. BYOD brings its own issues in that everyone connects – and they connect across the network. Is it the network? No, it’s the increased need for bandwidth.
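
The bandwidth point is easy to put in numbers. This is a deliberately crude sketch: the function name is mine, and the per-device average is an assumption that should be replaced with measurements from the network team.

```python
def link_utilization(devices: int, avg_mbps_per_device: float,
                     link_capacity_mbps: float) -> float:
    """Return estimated demand as a fraction of link capacity."""
    if link_capacity_mbps <= 0:
        raise ValueError("link capacity must be positive")
    return devices * avg_mbps_per_device / link_capacity_mbps
```

With made-up figures, 200 devices averaging 0.5 Mbps exactly fill a 100 Mbps link. Let everyone bring a phone as well and the same link is asked to carry twice what it was built for. The network is doing its job; the plan was not.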

As always with technology, communication is key. It’s not unheard of for an engineer to have a “back pocket of tricks” to resolve issues. Management cannot accept those tricks staying private. These steps and resolutions must be documented in order to make the overall environment stronger and more successful. Reward the guys with the bag of tricks, but be sure they help the less capable troubleshooters. This is not about job security for an individual; it’s about the strength of the whole.

And, if you haven’t figured it out yet, it’s RARELY JUST about the network.