How to Improve Reliability

Thursday, February 1, 2018


Monday, May 24, 2010

How Do I Implement PSM

I got this email from a friend and thought my response might be worth a blog post:

Ok so I may or may not be working for a very small local chemical plant as I haven't made my decision yet. So my question to you is this: if you owned a small old plant that you wanted to bring up to PSM compliance, what would your To Do list look like? I'm thinking that if it were me then my list would include stuff like:

- Bring in a contractor to clean up my equipment records, P&IDs etc (i.e. similar to the group we had in Company X)
- Get some software in place to input inspection data (i.e. I'm thinking of looking into maybe Lloyd's Register Capstone's RBMI software or ...)
- Look into software for managing MOCs
- Set up a Reliability program (note: not sure where to start with this one other than maybe a simple spreadsheet for now)

If you could throw some ideas my way it would be greatly appreciated so that I can get a better feel for what I might be getting myself into.

Here is my Answer:

First read the PSM law. It's a short read and very understandable (OSHA
1910.119). I have condensed it in the PowerPoint for simplicity, but you
still need to read the real thing. It is here:

http://www.osha.gov/pls/oshaweb/owadisp.show_document?p_table=STANDARDS&p_id=9760

Next clarify your scope with the company. Are you doing all of PSM or
just "Mechanical Integrity"? All of PSM includes Training, Operating
Procedures, Contractors, PHA, Mechanical Integrity.

Use the psm assessment.doc document to do your own assessment
of what they are doing today and identify gaps. The more you can
document of what they are already doing that meets the requirements,
the more time and effort you save.

Identify the units to be included by having your Chemical Engineers use Appendix A to see which units are covered. Exclude all units from your PSM procedures that are not required (sulfur plants, utilities, etc).

For each element of the standard, this is what I would do:

1910.119(c) Employee Participation
Write a document which describes what information is available to employees and how they access it.

1910.119(d) Process Safety Information
* Identify what you have and what is missing or out of date
* Develop plans to get what is missing and make sure what you have is up to date. (Hire some contractors who know what they are doing. I can recommend some if need be.)
* For some things like P&ID updates, have your own people do it. Have your operators walk them all down and mark them up. Have a contractor make the changes and draw them up. If you have nothing, bring in the laser scanning guys and have them scan the whole unit and give you drawings and 3D models. It is very expensive up front, but it will be cheaper and easier to update in the long run.

1910.119 (e) Process Hazard Analysis
* Put together a schedule to do PHAs and bring in a contractor to start running them. Schedule them so they stretch out over 5 years, because they need to be redone every 5 years and you don't want to do them all at once every 5 years.
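
As a rough illustration of the staggering idea, here is a minimal sketch that spreads units round-robin across a five-year cycle. The unit names, the start year, and the function names are all made up for the example; a real schedule would weigh risk and unit size rather than just list order.

```python
from collections import defaultdict


def stagger_pha_schedule(units, start_year, cycle_years=5):
    """Assign each unit a first PHA year, round-robin across the cycle."""
    schedule = defaultdict(list)
    for index, unit in enumerate(units):
        schedule[start_year + index % cycle_years].append(unit)
    return dict(schedule)


def next_revalidation(year_done, cycle_years=5):
    """Year by which a PHA done in year_done has to be redone."""
    return year_done + cycle_years


# Hypothetical unit list, for illustration only.
units = ["Unit A", "Unit B", "Unit C", "Unit D", "Unit E", "Unit F", "Unit G"]
plan = stagger_pha_schedule(units, start_year=2010)

for year in sorted(plan):
    print(year, plan[year], "redo by", next_revalidation(year))
```

Once the first cycle is done, each unit simply comes back around five years after its last PHA, so the workload stays level from year to year.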

1910.119 (f) Operating Procedures
* Decide on a format and have your operators start writing down everything they do on a routine basis. Consider somebody like RWD to come in and do this for you if your people can't do it. If your operators can read and write it's much cheaper if they do it.

1910.119 (g) Training
* As you get procedures done, conduct operator training on the new procedures and document it.
* Specifically identify what training you consider "required" and how often it needs to be done. Keep it to the minimum that meets the requirements.
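
Once you have decided what is "required" and how often, tracking it can start out very simply. The sketch below is only an illustration: the course names, intervals, operators, and dates are invented, and the intervals you actually use have to come from your own reading of the standard.

```python
from datetime import date


def refresher_due(last_completed, interval_years):
    """Date on which the next refresher falls due."""
    target_year = last_completed.year + interval_years
    try:
        return last_completed.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return last_completed.replace(year=target_year, day=28)


def overdue(records, intervals, today):
    """Return (operator, course) pairs whose refresher date has passed."""
    return [
        (operator, course)
        for operator, course, last_completed in records
        if refresher_due(last_completed, intervals[course]) < today
    ]


# Hypothetical courses and intervals (years), for illustration only.
intervals = {"Unit startup procedure": 3, "Emergency shutdown procedure": 1}

# Hypothetical training records: (operator, course, date last completed).
records = [
    ("Operator 1", "Unit startup procedure", date(2006, 3, 15)),
    ("Operator 2", "Unit startup procedure", date(2009, 6, 1)),
    ("Operator 2", "Emergency shutdown procedure", date(2009, 1, 10)),
]

for operator, course in overdue(records, intervals, today=date(2010, 5, 24)):
    print(operator, "is overdue for", course)
```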

1910.119 (h) Contractors
* Put together a simple audit and audit your contractors. Ask them to close gaps and re-audit them when they are done. Make sure you follow up and document.

1910.119 (i) Pre Startup Safety Review
* Write a procedure for walking down, HAZOP-ing, and verifying as-built for any change to P&ID and any new installation. Include a checklist with sign off.

1910.119 (j) Mechanical Integrity
* Specifically applies to pressure vessels and storage tanks; piping systems (including piping components such as valves); relief and vent systems and devices; emergency shutdown systems; controls (including monitoring devices and sensors, alarms, and interlocks); and pumps.

You need inspections, thickness calcs, corrosion rates, and next inspection dates for pressure vessels, tanks, and piping. You need calcs, test results, and test intervals for relief valves. You need regular system tests for emergency shutdown systems and controls (count only the instruments that are critical to controlling the process, plus alarms and shutdowns; indication-only instruments need not be included). The requirement for pumps is pretty vague: OSHA really only looks for overspeed trip testing on turbine-driven pumps. Put a PM plan in place for the other pumps and you are covered.

* You need written procedures that cover your mechanical integrity program (what gets done when by whom).
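
To make the thickness, corrosion-rate, and next-inspection-date calcs mentioned above concrete, here is a minimal sketch. It assumes a straight-line corrosion rate between two thickness readings, takes the required minimum thickness as an input (that number comes from your design code calcs), and uses a "half the remaining life, capped at a maximum interval" rule of the kind found in common inspection codes. The 10-year cap and the sample readings are assumptions for illustration; use the rules from whatever inspection code applies to your equipment.

```python
def corrosion_rate(previous_thickness, current_thickness, years_between):
    """Straight-line metal loss per year between two thickness readings."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (previous_thickness - current_thickness) / years_between


def remaining_life(current_thickness, minimum_thickness, rate):
    """Years until the wall reaches the required minimum thickness."""
    if rate <= 0:
        # No measurable metal loss, so thickness does not limit the life.
        return float("inf")
    return max(current_thickness - minimum_thickness, 0.0) / rate


def next_inspection_interval(life_years, max_interval_years=10.0):
    """Half the remaining life, capped at an assumed maximum interval."""
    return min(life_years / 2.0, max_interval_years)


# Hypothetical readings in inches, taken 4 years apart.
rate = corrosion_rate(previous_thickness=0.500, current_thickness=0.460,
                      years_between=4)
life = remaining_life(current_thickness=0.460, minimum_thickness=0.400,
                      rate=rate)
interval = next_inspection_interval(life)

print(f"corrosion rate: {rate:.3f} in/yr")
print(f"remaining life: {life:.1f} years")
print(f"inspect again within: {interval:.1f} years")
```

A spreadsheet does the same job; the point is that every vessel, tank, and piping circuit ends up with a documented rate, a remaining life, and a next inspection date.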

1910.119 (k) Hot Work Permit
* If not already covered, go work someplace else. This facility is not safe! Make sure the Hot Work process meets all the PSM requirements. Modify if not.

1910.119 (l) Management of Change
* Define what a "change" is and write a procedure on what must be done when something is changed. Turn what the law says into a procedure for the site. Put together a sign-off process to make sure requirements are met. Buy an MOC software package and implement it at the site. Write your process around the software workflow.
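
Before you buy software, the sign-off idea can be sketched in a few lines. This is only an illustration: the class and field names are invented, and the item names paraphrase the considerations listed in 1910.119(l), so check them against the text of the standard before building a real form.

```python
from dataclasses import dataclass, field

# Paraphrased from the considerations in 1910.119(l); verify against the standard.
REQUIRED_ITEMS = (
    "technical_basis",
    "safety_and_health_impact",
    "operating_procedure_changes",
    "time_period_for_change",
    "authorization",
)


@dataclass
class MocRecord:
    """One management-of-change request and its sign-offs (hypothetical)."""
    title: str
    signed_off: dict = field(default_factory=dict)  # item -> approver name

    def sign(self, item, approver):
        if item not in REQUIRED_ITEMS:
            raise KeyError(f"unknown MOC item: {item}")
        self.signed_off[item] = approver

    def missing_items(self):
        return [item for item in REQUIRED_ITEMS if item not in self.signed_off]

    def ready_to_implement(self):
        return not self.missing_items()


moc = MocRecord("Replace pump seal with a different seal type")
moc.sign("technical_basis", "Process Engineer")
moc.sign("safety_and_health_impact", "Safety Lead")

print("ready:", moc.ready_to_implement())
print("still needed:", moc.missing_items())
```

Whatever software you end up with should enforce the same thing: no change goes in until every required item has a name next to it.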

1910.119 (m) Incident Investigation
* Define what an incident is and who will do an investigation, and standardize on a process. I recommend Kepner-Tregoe "Problem Analysis".

1910.119 (n) Emergency Response
* Write an Emergency Response Plan. I can probably get a copy of one as an example if you need one.

1910.119 (o) Audits
* Set up a PSM Audit process and a regular audit frequency (the attached PSM Assessment doc is it). The standard requires you to certify a compliance audit at least every three years. Start self-auditing and get a third party to audit you every 3 - 5 years. You may want to do this up front to cover your butt.

1910.119 (p) Trade Secrets
* Nothing to do here. It says you can't use "Trade Secrets" as an excuse to not communicate hazards to your employees.

Good luck and have fun with this. It's a 5-year job.

Friday, April 30, 2010

Why Are There Few Maintenance AND Reliability Executives?

I got this question from Marc Laplante of Meridium and it is an excellent question.

I have a contact on LinkedIn who has been wondering if there are any executives out there who actually have Maintenance and/or Reliability in their title. I know of a few. Many reliability and maintenance people lament the fact that there isn't a place at the executive table for maintenance and reliability.

I would like to know if you have an opinion on this and if you've written anything on this subject?

The short answer is, I don't know. Here are my thoughts on why it may be.

The Concepts are Simple, Execution is Hard
Maintenance is pretty straightforward. Best practices have been around for 50 - 100 years, and everybody understands what they should be doing. The only problem is the equipment won't cooperate: it keeps breaking. If you can get past that problem, the operators won't cooperate: they keep breaking the equipment. If you can cure those problems, all the excitement goes out of maintenance. It's boring. Your favorite maintenance guy can't rush in to save the day and be the hero; he becomes like the accountant, a bookish individual with glasses who sits around planning all day. Maintenance fades into the background and gets marginalized.

Reliability is a bit newer and a bit more difficult. First of all there is the delayed effect. I like to draw parallels between safety and reliability because at their core, they are similar. Safety is keeping people out of harm's way. Reliability is keeping equipment out of harm's way. The tools are similar: both are based on risk, good planning, good design, and thinking about things before you do them. There is one big difference though. When you get hurt, you know it immediately. You feel pain, you see blood. Cause and effect happen close together. When you injure your equipment, it may not show up for some time, but it still shows up as increased maintenance and early failure. In the case of reliability, the cause is often separated in time from the effect. This makes it difficult for people to see the immediate impact of their actions. It sometimes makes it harder to figure out the cause. It makes it easy for one person to damage the equipment and be rewarded while sandbagging some unknown person in the future who suffers the consequence and is punished.

Reliability is also fairly new. Most of the concepts are about 60 years old. Deming and Juran were the early pioneers and are virtually unknown outside a few specialized communities in manufacturing in their home country, the USA. They are national heroes in Japan. These two guys, along with the dedication and hard work of the Japanese people, are responsible for the excellent quality products we get today from Japan and from plants that emulate them in other parts of the world, including here in the USA. So the concepts are fairly new (as a frame of reference, the company I work for is 125 years old). In addition to being new, some of the tools are mathematically intensive, so we have only had the computing capacity to do reliability analysis on a large scale, cheaply, for the past 15 - 20 years.

Execution is tough, because tough decisions have to be made to move out of the reactive realm. First you have to spend some time up front understanding your equipment before your reliability efforts take hold. Many companies are focused on last month, last quarter, last year and this month, next month, next quarter and this year. This is especially true for public companies. Even if you do the quickest things, like Operator Care and implementing a risk framework for decision making, it still takes 6 - 12 months to pay back, and it does not undo all the damage that is already done to the equipment. It takes several years for the damages/defects to work their way out of the system. It also takes a long time to get the equipment life planning done (RCM, FMEA, etc), which is what makes reliability sustainable. It takes enlightened leadership, commitment and company support over an extended time to get all the benefits from a reliability program.

The Generation Gap
Most maintenance executives today grew up before the computing power was around and never were reliability engineers. When I started working in the oil industry in the late 1980s, the term reliability engineer was synonymous with rotating equipment engineer, and this is still true in many companies. So many maintenance executives today can talk about it but have never done it and don't truly understand it. (My apologies to my peers whom I may have offended with that comment. It's my opinion, based on my experience.) It is only now that younger people who were reliability engineers in the 90s are moving into those more senior management positions.

High Performers
Many good companies promote high performers quickly. Even in the best companies, these people are identified early and rapidly promoted to senior manager levels in 10 - 15 years, rotating quickly through a lot of jobs across the business. These people are usually smart and have good leadership skills, but they lack experience because they moved so quickly. In many companies, if high performers do not make mistakes, they continue to move up. In these companies, once you recognize you are a high performer, you tend not to rock the boat so you don't get stuck halfway up the ladder (how's that for a mixed metaphor?). These are not usually the people who will lead your change effort, because it requires them to stick their neck out. Because they are moving fast and feel they need to make an impact to move up, these are often the people who kill a young reliability program through some cost cutting or personnel reduction effort before the organization is ready for it. They need results now to show they are worthy. Few companies reward leaders for coming in, following the existing processes and keeping things on track. Managers have to make an impact to prove their worth.

Reliability Guys are Technical Guys
Reliability guys are usually technical guys. Being a technical guy myself, I will say that a large majority of technical guys, including me, are weak on people skills. We're bad communicators. We get mad at people when they can't understand what we are talking about. We are easily frustrated with stupid policies, rules, regulations and hard-headed people who don't want to change the way they work. So we are often labeled as complainers and not listened to. We often have a hard time speaking the language of management (money, if you were wondering). We don't put our reasons for doing things in monetary terms for management. It's hard for us to quantify. My point in saying all this is that a lot of technical guys get relegated to a technical track and never move into management. Many don't want to. I would be the same had the Navy not spent a lot of time and money teaching me and developing my leadership skills. One of the few things of value I took away from my MBA education was a study done by Harvard Business School. I don't remember the name of the study or who did it, but the study reviewed successful executives across a range of industries and rated them on Technical Skills and Political Skills. Of these top executives, about 70% had excellent Political Skills and poor Technical Skills, 20% had excellent Technical Skills and poor Political Skills, and only about 10% were good at both (think Jack Welch here).

Top Leadership
At most companies, the top leadership positions seldom come from the engineering ranks. More often top leaders come from Marketing, Production, Supply, or Finance. This will vary somewhat by industry but you get the idea. These people only look at maintenance as a necessary expense and a large one at that. They are only interested in seeing it go down as a % of revenue. They don't see maintenance as a valuable contributor to company profitability. They see it as something we have to do to stay in business, a necessary evil.

The Frederick Taylor - Top Down, People are Machines Business Model
Last but not least, in the USA we were blessed that World War II was not fought on our soil. We built up our manufacturing base and supplied the world because everybody else's manufacturing was destroyed during that war. Unfortunately that blessing has turned into a curse in the last 30 years as everybody else built new, modern factories and our manufacturing base in the US eroded to almost nothing. Part of this was complacency, and part of it was that we were working on the outdated thinking of Frederick Taylor, whose ideas left us with a system where management makes all the decisions and the workers are treated like automatons (robots) with no brains and no skin in the game. The US is still operating on this model while much of the rest of the world is using a model more like Japan's (thanks to Deming and Juran). Engaging the entire knowledge of the workforce and knocking down silos is the best way to get the entire workforce giving 100% effort. But this topic is probably worth a whole other blog entry.

How do we change things?
Those of us who see it clearly must try to lead the way and convince others. I don't know of any other way. We have to demonstrate the value of our point of view by delivering results. We have to try to educate our peers. We have to educate younger people coming into the workforce. We have to share our knowledge. The marketplace will help us. As China and India and the fairly new European Union start to exert more influence in the world and take the leadership reins from the US, we need to stay in the game and be a player rather than fading into the background like England did.

Steve's Reliability Blog
