Food for Thought - An e-newsletter published by Software Quality Consulting, Inc.
June 2007, Vol. 4 No. 6

Mission Critical Software

What topics would you like to see in this newsletter? Each month, this newsletter tries to provide you with useful information. This is a two-way street and your feedback is important. Please send your thoughts and comments to steve@swqual.com.

--------------------------------------------------------------------------------

Welcome to Food for Thought(TM), an e-newsletter from Software Quality Consulting (http://www.swqual.com/index.html?Intro).

--------------------------------------------------------------------------------

*** In This Issue ***

In This Month’s Topic, I discuss techniques for delivering software that is mission-critical, highly reliable, and of high quality...

Regular features to look for each month are:

- Monthly Morsels
  Hints, tips, techniques and reference info related to this month’s topic
- Calendar
  Conferences, workshops, and meetings of interest to software engineers, QA engineers and anyone interested in software development

--------------------------------------------------------------------------------

*** This Month’s Topic ***

Mission Critical Software

Software has become a critical component of every sector of our economy. Many companies develop products that have software embedded within them. Cars, cell phones, ATM machines, medical devices, airport security systems, airplanes, and millions of other products all depend on software. And yet, all software is defective (http://www.swqual.com/newsletter/vol2/no11/vol2no11.html)!

Some product features being implemented in software today are truly beyond belief. For example, consider some of the features available on the 2007 Lexus LS:

Self-parking. Self-parking used to mean you parked your car yourself rather than have the valet do it for you. This car adds new meaning to self-parking since it can actually parallel park itself (http://www.lexus.com/models/LS/features/exterior/advanced_parking_guidance_system.html?demo=ls_parking&s_ocid=30019).

Pre-Collision System. A front-mounted millimeter-wave radar sensor constantly monitors the distance and closing speed of a vehicle ahead. When the software determines that a frontal collision is unavoidable (http://www.autospies.com/video/001-shows-the-Lexus-LS-pre-collision-system-in-action-209/), the software preemptively tightens the front seatbelts and preps the brakes for increased braking pressure the moment the driver steps on the brake pedal.

Smile, you’re on candid camera!
A special face detection (http://www.engadget.com/2007/05/03/lexus-ls600hls-face-detection-camera-warning-system-get-spied/) camera mounted above the steering column determines if the driver is daydreaming and not paying attention to the road. If the face detection software determines the driver is not paying attention to the road, it provides audible and visual alarms to get the driver’s attention. If that doesn’t work, it gradually slows the car down and tightens the seat belts.

And the list goes on and on...

How many lines of code would you guess are kickin’ around inside that Lexus? By 2010, General Motors [2] estimates their cars will have over 100 million lines of code. Let me say that again – soon there will be 100 million lines of code running in your car! Given the track record of the auto industry, how reliable would you guess all of that code will be?

Recalls have plagued the automotive industry. Many models built since 2000 have had software recalls. Even makes known for quality, such as Toyota, Mercedes-Benz and BMW, have had recalls related to software problems. It should come as no surprise to learn that the auto industry spends about $2 billion annually to correct software defects in cars... Read further about software-related automotive recalls (http://www.wired.com/cars/coolwheels/news/2004/06/63846?currentPage=2)

Given the amount of code embedded in today’s cars and the complexity inherent in this code, it’s entirely possible that the next time you bring your car in for an oil change you may get a software update as well. And, given that more and more cars have GPS and satellite communications capability, it’s possible that in the not-too-distant future, software updates, fixes, hot patches and maintenance releases may all be downloaded to your car. And of course, there will eventually be hacks for all of this software... I don’t know about you, but this makes me want to keep my car in the garage.

IS YOUR SOFTWARE MISSION CRITICAL?

Mission critical software refers to software that must work reliably in order for the “mission” to be successful. If the software fails, it’s likely the mission will fail as well. Does your company develop “mission critical” software? It seems that more and more companies are...

It wasn’t that long ago when the term “mission critical” was only applied to systems developed by the US DoD and NASA. But today, more and more software is deemed “mission critical”. Think about all the software in the Lexus LS. If it fails to work as advertised and someone is injured or killed as a result, do you think that the company will be sued? Of course they will.

The software developed for the Space Shuttle’s on-board computers is truly mission critical and arguably the most reliable software ever developed. Consider what this software does:

“At T-minus 6.6 seconds, if the pressures, pumps, and temperatures are nominal, the computers give the order to light the shuttle main engines -- each of the three engines firing off precisely 160 milliseconds apart, tons of super-cooled liquid fuel pouring into combustion chambers, the ship rocking on its launch pad, held to the ground only by bolts. As the main engines come to one million pounds of thrust, their exhausts tighten into blue diamonds of flame.

Then and only then at T-minus zero seconds, if the computers are satisfied that the engines are running true, they give the order to light the solid rocket boosters. In less than one second, they achieve 6.6 million pounds of thrust. And at that exact same moment, the computers give the order for the explosive bolts to blow, and 4.5 million pounds of spacecraft lifts majestically off its launch pad.

The software gives the orders to gimbal the main engines, executing the dramatic belly roll the shuttle does soon after it clears the tower. The software throttles the engines to make sure the craft doesn't accelerate too fast.
It keeps track of where the shuttle is, orders the solid rocket boosters to fall away, makes minor course corrections, and after about 10 minutes, directs the shuttle into orbit more than 100 miles up. When the software is satisfied with the shuttle's position in space, it orders the main engines to shut down -- weightlessness begins and everything starts to float.” [1]

Nothing could be more mission critical than launching the Space Shuttle. And every one of the thousands of critical decisions that need to be made during the launch process is made by software – software that has to work reliably.

The Shuttle uses five identical IBM 32-bit general purpose computers. Four redundant computers run identical software - operating in lockstep, constantly checking each other. If one fails, the three functioning computers "vote" it out of the system, which isolates it from vehicle control. If one of the three remaining computers then fails, the two functioning computers vote it out. In the rare case of two out of four computers simultaneously failing (a two-two split), one group is picked at random. A fifth backup computer system runs different software developed by a different company. The fifth computer is used only if the entire four-computer primary system fails.

Redundancy has been a basic principle used to improve hardware reliability - which is why the Shuttle has four computers. However, redundancy does nothing to improve software reliability. Replicating the same potentially defective code in multiple computers could be catastrophic. That’s why the Space Shuttle designers added the fifth computer with software developed by a different team. And there are some significant differences in the principles that can be used to improve software reliability...

Learn more about Software Reliability...
(http://www.swqual.com/newsletter/vol4/no6/Measuring%20Software%20Reliability.pdf)

SOFTWARE FOR THE SPACE SHUTTLE ON-BOARD COMPUTERS

The Space Shuttle’s on-board software was developed by a highly talented and motivated team of software engineers at a division of Lockheed Martin rated at SEI Level 5. The software developed for each of the four primary computers consists of about 420,000 lines of code written in a real-time programming language called HAL/S (http://en.wikipedia.org/wiki/HAL/S). Some members of the team have been working together for almost 20 years. And their accomplishments have been truly amazing.

What They Achieved

The last three versions of software released had just one reported defect in each version. The last 11 releases had a grand total of 17 reported defects. If this software had been developed using typical commercial software development methods, it has been estimated that there would have been at least 5,000 defects. Given the inherent complexity of this code and its modest size, this is truly an amazing accomplishment.

When I first read about this accomplishment, two questions came to mind – how did they do this and what can be learned from this experience...

How They Achieved It

The software engineering team, working at the Johnson Space Center in Clear Lake, Texas, believed in four key principles:

- The product is only as good as the product specifications...
- The best teamwork results from a healthy rivalry...
- Record the genealogy of every line of code...
- Don't just fix defects - fix what allowed the defect to occur in the first place...

Let’s look at these four principles and see what we can learn...

1. The product is only as good as the product specs

The team spent much of their time writing and reviewing specs. Specs were written to excruciating detail – a level of detail commonly found in blueprints.
Everything in the specs had to be reviewed, understood, and agreed to by the software engineering team – before anyone wrote any code. And nothing could be changed without agreement and understanding.

Why did they do this? Because quite often, astronauts who were to fly the Space Shuttle participated in these reviews. In the words of one software engineer:

“If the software isn't perfect, some of the people we go to meetings with might die.” [1]

As a result of this painstaking attention to detail -

“The shuttle group produces grown-up software, and the way they do it is by being grown-ups.” [1]

2. The best teamwork results from a healthy rivalry

The healthy rivalry on this team was between software engineers and testers. On this team, software engineers were expected to deliver code as defect-free as humanly possible. Testers would routinely beat on the code with flight scenarios and simulations designed to reveal as many defects as possible. The result was a “friendly adversarial relationship” that produced simply incredible results: the team found 85% of its defects before formal testing began and 99.9% of its defects before software was delivered to NASA.

3. Record the genealogy of every line of code

Not only was every line of code documented, but everything that happened to it was recorded: every time it was changed, when and why it was changed, which specs detailed the change, and who reviewed and approved it. Every single defect ever found while writing or working on the software has been recorded, going back almost 20 years.

Using all of this genealogy data, the team developed models that predict how many defects there are likely to be in each new version of software. As a result, if developers and testers find too few defects, everyone looks harder until reality and predictions match.

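To make the prediction idea concrete, here is a minimal sketch in Python. It is not the shuttle team's model, which the source article does not describe in detail. It simply assumes that defects are injected at a roughly constant rate per thousand lines of changed code, estimates that rate from past releases, and flags a new release when too few defects have been found so far. The Release record, the function names, the 90% tolerance and all of the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Release:
    """One historical release (figures below are invented for illustration)."""
    changed_kloc: float   # thousands of lines added or modified
    defects_found: int    # all defects eventually attributed to the release


def injection_rate(history: List[Release]) -> float:
    """Defects injected per KLOC changed, pooled over past releases."""
    total_kloc = sum(r.changed_kloc for r in history)
    total_defects = sum(r.defects_found for r in history)
    if total_kloc <= 0:
        raise ValueError("history must contain some changed code")
    return total_defects / total_kloc


def predicted_defects(history: List[Release], changed_kloc: float) -> float:
    """Expected number of defects in a new release of the given size."""
    return injection_rate(history) * changed_kloc


def keep_looking(history: List[Release], changed_kloc: float,
                 found_so_far: int, tolerance: float = 0.9) -> bool:
    """True while the defects found fall short of the prediction.

    `tolerance` is an assumed threshold: 0.9 means testing continues until
    at least 90% of the predicted defects have been found.
    """
    return found_so_far < tolerance * predicted_defects(history, changed_kloc)


# Hypothetical history: 140 defects over 35 KLOC, i.e. 4 defects per KLOC.
history = [Release(10.0, 40), Release(5.0, 25), Release(20.0, 75)]
expected = predicted_defects(history, changed_kloc=8.0)
print(f"predicted defects: {expected:.1f}")
print("keep looking:", keep_looking(history, 8.0, found_so_far=20))
```

With this made-up history, an 8 KLOC release is predicted to contain about 32 defects, so a team that has found only 20 would keep looking. A real model would likely account for more than code size, such as the complexity of the change or the experience of the people making it.
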
4. Don't just fix defects - fix what allowed the defect in the first place

The team didn’t blame people for defects – they blamed the process. As a result, the process was constantly analyzed to discover why and how defects got through. And accountability is a team concept - no one person is solely responsible for writing or inspecting code; it’s a shared responsibility. The process not only finds defects in software - the process finds defects in the process.

WHAT CAN WE LEARN?

Based on the extraordinary results achieved by this team, here are a few things we can learn about developing mission critical software:

- Good requirements are essential to produce reliable software
To deliver grown-up software, we need to behave like grown-ups. That means starting with clear, unambiguous requirements. Write requirements that are testable... (http://www.swqual.com/newsletter/vol3/no3/vol3no3.html)

- Use historical data to predict the defect injection rate
What is your organization’s defect injection rate? If you don’t know this, you have no way of knowing how many defects you haven’t found. You need to collect data so that you can accurately predict the number of defects injected in every release... Estimate the number of unknown defects in a release... (http://www.swqual.com/newsletter/vol2/no11/vol2no11.html)

- Development and Test are viewed as peers - each with an equal stake in the outcome
Developers and testers must be able to work cooperatively. Developers should be expected to deliver code that is as defect-free as humanly possible. Testers should be expected to find the defects developers don’t find. You should know approximately how many defects there are in a given release. And for mission critical software, you’re not done until you’ve found as many of them as humanly possible.

- Blame the process and not people for failures and trust the process to be self-correcting
People will always make mistakes.
We need effective processes that help us find most of them and then help identify what aspects of the process need to be changed to ensure that more problems are detectable...

SUMMARY

Developing mission critical software carries with it some serious responsibilities. Mission critical software needs to be highly reliable and robust. There are several techniques that can be used to measure software reliability (http://www.swqual.com/newsletter/vol4/no6/Measuring%20Software%20Reliability.pdf). Measuring software reliability is critical because we can only improve that which we can measure.

Stay tuned for a new workshop – Improving Software Reliability – to be sponsored by the IEEE Boston Section and presented in the Boston area this fall.

The next Food for Thought(TM) e-newsletter will be published in September. Have a safe and relaxing summer...

--------------------------------------------------------------------------------

*** Monthly Morsels ***

Every month in this space you’ll find additional information related to this month’s topic.

- References:
  [1] Fishman, C., “They Write the Right Stuff”, Fast Company, Dec 1996
  [2] Charette, R., “Why Software Fails”, IEEE Spectrum, September 2005

- On-line Resources:
  Motor Industry Software Reliability Association (UK) MISRA (http://www.misra-c2.com/)

--------------------------------------------------------------------------------

*** Calendar ***

Every month you’ll find news here about local and national events that are of interest to the software community...

- Software Quality Calendar
  There are many organizations that sponsor monthly meetings, workshops, and conferences of interest to software professionals. Find out what’s happening... (http://www.swqual.com/links/upcoming.html)

- Workshops Offered by Software Quality Consulting
  Software Quality Consulting offers workshops in many topics related to software process improvement. Get more info...
  (http://www.swqual.com/seminars/courses.html)

--------------------------------------------------------------------------------

*** About SQC ***

Software Quality Consulting provides consulting, training, and auditing services tailored to meet the specific needs of clients. We help clients fine-tune their software development processes and improve the quality of their software products. The overall goal is to help clients achieve Predictable Software Development(TM) – so that organizations can consistently deliver quality software with promised features in the promised timeframe.

To learn more about how we can help your organization, visit our web site (http://www.swqual.com/index.html?AboutSQC) or send us an email (info@swqual.com).

--------------------------------------------------------------------------------

I hope this newsletter has been informative and helpful. Your comments and feedback are most welcome. Send me your feedback... (info@swqual.com)

Thanks,
Steve Rakitin
info@swqual.com

Food for Thought, Predictable Software Development, Act Like a Customer, and ALAC are trademarks of Software Quality Consulting, Inc.

Copyright 2007. Software Quality Consulting, Inc. All rights reserved.
Graphic design by Sage Studio