Lean Posts

“We can acquire knowledge from doing something incorrectly, but only if we can determine the cause of the error and correct it.”

Russell Ackoff

Summary: It’s never too late to start applying five whys, even if you’re saddled with zillions of lines of legacy code. Just start asking why whenever you find a problem—you’ll automatically start fixing the 20% of underlying issues that cause 80% of your problems. Five whys was first discovered by Toyota—if it can work for cars, it can work for you.

This is a guest post by Eric Ries, a founder of IMVU and an advisor to Kleiner Perkins. Eric also has a great blog called Startup Lessons Learned.

In Part 1, I described how to use five whys to discover the root cause of problems, make corrections, and build an immune system for your startup. In Part 2, I explained how to get started with five whys and how IMVU built a startup immune system by applying five whys for months and years. In this final part, I’ll describe how to apply five whys to legacy startups.

It’s never too late to start asking why.

When I explain five whys to entrepreneurs and big-company types alike, I sometimes get this response: “Well, sure, if you start out with all those great tools, processes and TDD from the beginning, that’s easy! But my team is saddled with zillions of lines of legacy code and… and…”

So let me say for the record: we didn’t start with any of this at IMVU. We didn’t even practice TDD across our whole team. We’d never heard of five whys, and we had plenty of “agile skeptics” on the team. By the time we started doing continuous integration, we had tens of thousands of lines of code that weren’t under test coverage.

But the great thing about five whys is that it has a Pareto principle built right in. Because the most common problems keep recurring, your prevention efforts are automatically focused on the 20% of your product that needs the most help. That’s also the same 20% that causes you to waste the most time. So five whys pays for itself awfully fast, and it makes life noticeably better almost right away. All you have to do is get started.

If it works for cars, it can work anywhere.

So thank you, Taiichi Ohno. I think you would have liked seeing all the waste we’ve been able to drive out of our systems and processes, all in an industry that didn’t exist when you started your journey at Toyota.

And I especially thank you for proving that this technique can work in one of the most difficult and slow-moving industries on earth: automobiles. You’ve made it hard for any of us to use the most pathetic excuse of all: “Surely, that can’t work in my business, right?” If it can work for cars, it can work for you.

What are you waiting for?

“For every dollar spent in failure, learn a dollar’s worth of lesson.”

Jesse Robbins, Amazon’s former Master of Disaster

Summary: Get started with five whys by applying it to a specific team with a specific problem. Select a five whys master to conduct a post mortem with everyone who was involved in the problem. Email the results of the analysis to the whole company. Repeatedly applying five whys at IMVU created a startup immune system that let our developers go faster by reducing mistakes.

This is a guest post by Eric Ries, a founder of IMVU and an advisor to Kleiner Perkins. Eric also has a great blog called Startup Lessons Learned.

In Part 1, I described how to use five whys to discover the root cause of problems, make corrections, and build an immune system for your startup. So…

How do you get started with five whys?

I recommend that you start with a specific team and a specific class of problems. For my first time, it was scalability problems and our operations team. But you can start almost anywhere—I’ve run this process for many different teams.

Start by having a single person be the five whys master. This person will run the post mortem whenever anyone on the team identifies a problem.

But don’t let the five whys master work alone; it’s important to get everyone who was involved with the problem (including those who diagnosed or debugged it) into a room together. Have the five whys master lead the discussion and give him or her the power to assign responsibility for the solution to anyone in the room.

Distribute the results of five whys to the whole company.

Once that responsibility has been assigned, have the person who now owns the solution email the whole company with the results of the analysis. This last step is difficult, but I think it’s very helpful. A five whys analysis should read like plain English. If it doesn’t, you’re probably obfuscating the real problem.

The advantage of sharing this information widely is that it gives everyone insight into the kinds of problems the team is facing, but also insight into how those problems are being tackled. If the analysis is airtight, it makes it pretty easy for everyone to understand why the team is taking some time out to invest in problem prevention instead of new features.

If, on the other hand, it ignites a firestorm—that’s good news too. Now you know you have a problem: either the analysis is not airtight, and you need to do it over again, or your company doesn’t understand why what you’re doing is important. Figure out which of these situations you’re in, and fix it.

What happens when you apply five whys for months and years?

Here’s what happens over time, in my experience.

People get used to the rhythm of five whys, and it becomes completely normal to make incremental investments. Most of the time, you invest in things that otherwise would have taken tons of meetings to decide to do.

You’ll start to see people from all over the company chime in with interesting suggestions for how you could make things better. Now, everyone is learning together—about your product, process, and team. Each five whys email is a teaching document.

IMVU’s immune system after years of five whys.

Let me show you what this looked like after a few years of practicing five whys in the operations and engineering teams at IMVU. We had made so many improvements to our tools and processes for deployment that it was pretty hard to take the site down. We had five strong levels of defense:

  1. Each engineer had his or her own sandbox, which mimicked production as closely as possible (whenever it diverged, we’d inevitably find out in a five whys shortly thereafter).
  2. We had a comprehensive set of unit, acceptance, functional, and performance tests, and practiced TDD across the whole team. Our engineers built a series of test tags, so you could quickly run a subset of tests in your sandbox that you thought were relevant to your current project or feature.
  3. 100% of those tests ran, via a continuous integration cluster, after every checkin. When a test failed, it would prevent that revision from being deployed.
  4. When someone wanted to do a deployment, we had a completely automated system that we called the cluster immune system. This would deploy the change incrementally, one machine at a time. That process would continually monitor the health of those machines, as well as the cluster as a whole, to see if the change was causing problems. If it didn’t like what was going on, it would reject the change, do a fast revert, and lock deployments until someone investigated what went wrong.
  5. We had a comprehensive set of Nagios alerts that would trigger a pager in operations if anything went wrong. Because five whys kept turning up a few key metrics that were hard to set static thresholds for, we even had a dynamic prediction algorithm that would make forecasts based on past data, and fire alerts if the metric ever went out of its normal bounds. (You can read a cool paper one of our engineers wrote on this approach.)
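The fourth layer, the cluster immune system, is the easiest one to picture in code. What follows is only a minimal sketch of the behavior described above (deploy one machine at a time, check health after each step, revert and lock on trouble), not IMVU’s actual system. The class name, the machine names, the revision labels, and the `is_healthy` callback are all hypothetical.

```python
class ClusterImmuneSystem:
    """Toy model of an incremental deploy with health checks (hypothetical design)."""

    def __init__(self, machines, is_healthy):
        self.machines = list(machines)
        # is_healthy is an assumed callback: it receives the mapping of
        # machine -> revision and returns True if the cluster looks fine.
        self.is_healthy = is_healthy
        self.deployed = {machine: "baseline" for machine in self.machines}
        self.locked = False

    def deploy(self, revision):
        if self.locked:
            raise RuntimeError("deployments are locked until someone investigates")
        previous = dict(self.deployed)
        for machine in self.machines:
            # Roll the change out one machine at a time.
            self.deployed[machine] = revision
            if not self.is_healthy(self.deployed):
                # Reject the change: fast revert, then lock further deployments.
                self.deployed = previous
                self.locked = True
                return False
        return True


def no_bad_revision(deployed):
    # Stand-in health check: pretend revision r1234 contains the infinite loop.
    return "r1234" not in deployed.values()


cluster = ClusterImmuneSystem(["web1", "web2", "web3"], no_bad_revision)
good = cluster.deploy("r1233")
bad = cluster.deploy("r1234")
print(good, bad, cluster.locked)
```

In this sketch the bad revision only ever reaches the first machine before it is reverted, which is the point of deploying incrementally. A real health check would watch machine and cluster metrics rather than inspect revision labels.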

A strong immune system lets you go faster by reducing mistakes.

So if you had been able to sneak into the desk of any of our engineers, log into their machine, and secretly check in an infinite loop on some highly-trafficked page, somewhere between 10 and 20 minutes later, they would have received an email with a message more-or-less like this:

“Dear so-and-so, thank you so much for attempting to check in revision 1234. Unfortunately, that is a terrible idea, and your change has been reverted. We’ve also alerted the whole team to what’s happened, and look forward to you figuring out what went wrong. Best of luck, Your Software.”

OK, that’s not exactly what it said. But you get the idea.

Having this series of defenses was helpful for doing five whys. If a bad change got to production, we’d have a built-in set of questions to ask: Why didn’t the automated tests catch it? Why didn’t the cluster immune system reject it? Why didn’t operations get paged? And so forth.

And each and every time, we’d make a few more improvements to each layer of defense. Eventually, this let us do deployments to production dozens of times every day, without significant downtime or bug regressions.

In Part 3, I’ll show you how to apply five whys to “legacy” startups.

“When confronted with a problem, have you ever stopped and asked why five times?”

Taiichi Ohno

Summary: Whenever you find a defect, ask why five times to discover the root cause of the problem. Then make corrections at every level of the analysis. By applying five whys whenever you find a defect, you will (1) uncover the human problems beneath technical problems and (2) build an immune system for your startup.

This is a guest post by Eric Ries, a founder of IMVU and an advisor to Kleiner Perkins. Eric also has a great blog called Startup Lessons Learned.

Taiichi Ohno was one of the inventors of the Toyota Production System. His book, Toyota Production System, is a fascinating read, even though it’s decidedly non-practical. After reading it, you might not even realize that there are cars involved in Toyota’s business. Yet there is one specific technique that I learned most clearly from this book: asking why five times. I believe this is a critical lean startup technique.

When something goes wrong, we tend to see it as a crisis and seek to blame. A better way is to see it as a learning opportunity. Not in the existential sense of general self-improvement. Instead, we can use the technique of asking why five times to get to the root cause of the problem and make corrections.

Ask why five times whenever you discover a defect.

Here’s how it works. Let’s say you notice that your website is down. Obviously, your first priority is to get it back up. But as soon as the crisis is past, have the discipline to conduct a post-mortem in which you start asking why:

  1. Why was the website down? The CPU utilization on all our front-end servers went to 100%.
  2. Why did the CPU usage spike? A new bit of code contained an infinite loop!
  3. Why did that code get written? So-and-so made a mistake.
  4. Why did his mistake get checked in? He didn’t write a unit test for the feature.
  5. Why didn’t he write a unit test? He’s a new employee, and he was not properly trained in Test Driven Development (TDD).

Make five corrections.

So far, this isn’t very different from the kind of analysis any competent operations team would conduct for a site outage. The next step is this: you have to commit to making a proportional investment in corrective action at every level of the analysis. So, in the example above, we’d have to take five corrective actions:

  1. Bring the site back up.
  2. Remove the bad code.
  3. Help so-and-so understand why his code doesn’t work as written.
  4. Train so-and-so in the principles of TDD.
  5. Change the new engineer orientation to include TDD.
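One way to keep yourself honest about making a correction at every level is to record each why next to its fix. The snippet below is a minimal sketch of that idea using the outage example above. The `Why` record and the `five_whys_report` helper are hypothetical names chosen for illustration, not part of any tool mentioned here.

```python
from dataclasses import dataclass


@dataclass
class Why:
    question: str
    answer: str
    correction: str  # the corrective action committed to at this level


def five_whys_report(whys):
    """Render the analysis as plain text, one correction per level."""
    lines = []
    for level, why in enumerate(whys, start=1):
        lines.append(f"{level}. {why.question} {why.answer}")
        lines.append(f"   Correction: {why.correction}")
    return "\n".join(lines)


outage = [
    Why("Why was the website down?",
        "CPU utilization on all front-end servers went to 100%.",
        "Bring the site back up."),
    Why("Why did the CPU usage spike?",
        "A new bit of code contained an infinite loop.",
        "Remove the bad code."),
    Why("Why did that code get written?",
        "So-and-so made a mistake.",
        "Help so-and-so understand why the code doesn't work as written."),
    Why("Why did the mistake get checked in?",
        "There was no unit test for the feature.",
        "Train so-and-so in the principles of TDD."),
    Why("Why was there no unit test?",
        "New employees were not trained in TDD.",
        "Change the new engineer orientation to include TDD."),
]

report = five_whys_report(outage)
print(report)
```

Because every `Why` requires a `correction`, an analysis that stops at “so-and-so made a mistake” is visibly incomplete.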

Making corrections builds your startup immune system.

I have come to believe that this technique should be used for all kinds of defects, not just site outages. Each time, we use the defect as an opportunity to find out what’s wrong with our process, and make a small adjustment.

By continuously adjusting, we eventually build up a robust series of defenses that prevent problems from happening. This approach is at the heart of breaking down the “time/quality/cost, pick two” paradox, because these small investments cause the team to go faster over time.

5 whys uncovers the human problems beneath technology problems.

In the example above, what started as a technical problem actually turned out to be a human and process problem. This is completely typical. Our bias as technologists is to focus on the product part of the problem, and five whys tends to counteract that tendency.

It’s why, at my previous job, we were able to get a new engineer completely productive on their first day. We had a great on-boarding process, complete with a mentoring program and a syllabus of key ideas to be covered. Most engineers would ship code to production on their first day.

Make your corrections proportional to the cost of the defect.

We didn’t start with a great program like that, nor did we spend a lot of time all at once investing in it. Instead, five whys kept leading to problems caused by an improperly trained new employee, and we’d make a small adjustment. Before we knew it, we stopped having those kinds of problems altogether.

So it’s important to remember the proportional investment part of the rule above. It’s easy to decide that when something goes wrong, a complete ground-up rewrite is needed. It’s part of our tendency to focus on the technical and to overreact to problems.

If you have a severe problem, like a site outage, that costs your company tons of money or causes lots of person-hours of debugging, go ahead and allocate about that same number of person-hours or dollars to the solution.

The budget for corrections should be, in total, proportional to the cost of the defect that triggered the five whys. So, if the site was down and five people burned a whole day on it, maybe five person-days of fixing is appropriate. But if the problem cost three customers 25 cents each, maybe only a few hours is appropriate.

But always have a maximum, and always have a minimum. For small problems, just move the ball forward a little bit. Don’t over-invest. If the problem recurs, five whys will give you a little more budget to move the ball forward some more. You can keep your cool because five whys will be there if the problem recurs.
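The proportional rule with a floor and a ceiling is simple arithmetic. Here is a minimal sketch, assuming the cost of the defect has already been converted into person-hours. The function name and the default minimum (1 hour) and maximum (80 hours) are made-up values for illustration; pick your own.

```python
def correction_budget(defect_cost_hours, minimum_hours=1.0, maximum_hours=80.0):
    """Total person-hours to spend on corrections for one five whys.

    The budget matches the cost of the defect, but never drops below
    the minimum or rises above the maximum (both assumed values).
    """
    if defect_cost_hours < 0:
        raise ValueError("defect cost cannot be negative")
    return max(minimum_hours, min(defect_cost_hours, maximum_hours))


# Five people burned a whole (8-hour) day on an outage.
outage_budget = correction_budget(5 * 8)

# A tiny problem that cost a few minutes of someone's time.
small_budget = correction_budget(0.25)

# A disaster that ate 500 person-hours still gets a capped budget.
capped_budget = correction_budget(500)

print(outage_budget, small_budget, capped_budget)
```

The floor means small problems still move the ball forward a little, and the ceiling keeps a single bad day from turning into a ground-up rewrite.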

In Part 2, I’ll describe how to get started with five whys.

“We are using Pivotal Tracker to manage all of our new web apps under development, this thing rocks.”

Ezra Zygmuntowicz, Founder, Engine Yard

“It’s a relief to open Tracker at the start of the day and focus on the next most important task.”

Aaron Peckham, Founder, Urban Dictionary

No matter what you’re using for project management, take a close look at Pivotal Tracker. I’ve tried Bugzilla, Trac, Basecamp, FogBugz, Microsoft Project, and Lighthouse—and Tracker is the best for my needs. I’ve shown Tracker to many startups and many have made the switch.

10 reasons I like Tracker.

  1. It’s free.
  2. It’s hosted.
  3. It’s a joy to use. It’s the iPod of project management software. It’s all drag-and-drop and clickity-clack and it just works.
  4. It’s multi-user. Your co-founder in North Korea can make changes in Tracker and you will see them instantly. No page reloads.
  5. It’s for lean startups. The building block in Tracker is a story: an increment of customer value that you deliver with minimal waste.
  6. It’s about completing your next most important task—not maintaining mile-long to-do lists, Gantt charts, and lists of bugs.
  7. It’s transparent. Everybody on the team knows what everybody else is working on, their priorities, and their accomplishments.
  8. It’s in sync with reality. It doesn’t take time to keep your requirements and schedule in sync with reality, even if your business priorities change daily.
  9. It doesn’t do much. No, it doesn’t do dependencies and critical paths. It just keeps you focused on delivering value to customers.
  10. It’s powerful as hell. Tracker hides a lot of technology under a simple interface. It’s a serious JavaScript-intensive web application that’s in the same league as Gmail and Google Maps.
  11. Bonus reason: Everything is on one page—there’s no need to navigate around (unlike other project management tools). More Gmail, less Hotmail.

If it isn’t clear by now, Tracker isn’t a bug manager posing as project management software.

If you’re already lean, Tracker is a no-brainer. If you’re not lean, Tracker is a good way to start getting lean.

What do other folks say about Tracker?

Read the testimonials from people who are using Tracker. I particularly like this one from Aaron Peckham, the founder of Urban Dictionary:

“I leave Tracker open all day. I use it for documenting, estimating and prioritizing things that need to be done. It’s a relief to open Tracker at the start of the day and focus on the next most important task. It keeps me from getting distracted and having too many things going at the same time. It also serves as documentation of what I’ve completed in the past—to show that I’m making good use of my time.”

Want more opinions? See what people are saying about Pivotal Tracker on Twitter.

What do you think about Tracker?

If you give Tracker a try, please let us know what you think!

William Pietri left a great comment on Books for Entrepreneurs: Agile Software Development:

“Great to see these approaches getting more attention in the startup world. I’ve been soaking in both agile methods and startup companies a long time, and I think they go perfectly together. They provide just enough structure to make everybody effective, without unnecessary constraints or process bloat.

“One of my clients, sidereel.com, started with an XP-ish process from the first week. They had an alpha for investors in 2 months, a private beta in 3, and a public beta in 4 months. Now they’re happily funded, up to a dozen people, and just shy of being an Alexa 1000 site [emphasis added]. Weekly iterations meant they always had new progress to show potential investors. And being able to change direction easily meant they could try a lot of things out and invest heavily in areas the users liked.

“Speaking of which, I and a colleague are interested in trying out some variations of the Planning Game with a couple of user-focused startups. If any VH readers want to be guinea pigs, we’re looking for Bay Area teams that are early in the process, actively struggling to put together a product plan, and have both business and technical people involved full time. If there’s anybody here that meets those criteria, just drop me a line. My email address is my first name at my domain name [scissor.com].”

Also check out An XP Team Room, where William walks us through the offices of a lean startup in gory detail.

“For heavens sake, if you haven’t gotten comfy with Agile techniques and thinking, get on it right now.”

Tim Bray, Editor of XML 1.0

Summary: Start learning how to be lean by reading Agile Software Development. It isn’t the cheapest book in the world but it’s one of the cheapest investments you will make in your startup. It includes 14 simple and counterintuitive practices that will help you engineer your engineering and product teams. This post includes several excerpts from the book.

In Lean startups find their moment, we defined lean as the never-ending process of eliminating waste. We described its benefits: eliminating waste makes your business fast, cheap, high quality, and effective—yes, all the good stuff, all at once. We defined waste as anything that is not absolutely necessary for creating customer value. And we described the two greatest wastes: overproduction and inventory.

We also said that lean probably seems abstract so far. Time to make it more concrete.

Agile Software Development

We first learned how to be lean from a book called Agile Software Development by Robert “Uncle Bob” Martin. Uncle Bob is one of the creators of agile, and agile is just another word that software developers use instead of lean. This isn’t the cheapest book in the world, but it’s one of the cheapest investments you will make in your startup.

Chris Sepulveda recommended this book when we hired Pivotal Labs to help us with software development. I remember having one of those aha! moments when I first read it, “Oh, so this is how you quickly translate a founder’s vision into happy customers (hint: it isn’t by specifying the requirements up front). This is how you deliver software on time (hint: it isn’t by working overtime). This is how you eliminate bugs (hint: it isn’t with QA). This is how you make developers effective (hint: it isn’t by putting them in a quiet room by themselves). This is how you know when you’ll be done (hint: it isn’t with a Gantt chart). This is how you engineer an engineering and product team.”

Excerpt

Here’s an excerpt of the simple, lightweight practices in Agile Software Development:

“User Stories

“In order to plan a project, we must know something about the requirements but we don’t need to know very much… we only need to know enough about a requirement to estimate it.

“The specifics of a requirement are likely to change with time, especially once the customer [the founder or product manager] begins to see the system come together. There is nothing that focuses requirements better than seeing the nascent system come to life. Therefore, capturing the specific details about a requirement long before it is implemented is likely to result in wasted effort…

“The Planning Game

“The essence of the planning game is the division of responsibility between business and development. The business people (a.k.a. the customers) decide how important a feature is, and the developers decide how much the feature will cost to implement.

“At the beginning of each iteration [an iteration is one week], the developers give the customers a budget, based on how much they were able to get done in the last iteration. The customers choose stories whose costs total up to, but do not exceed that budget.

“With these simple rules in place, and with short iterations and frequent releases… the customers will be able to determine how long their project will take and how much it will cost.

“Simple Design

“An [agile] team will probably not start with infrastructure. They probably won’t select the database first. They probably won’t select the middleware first. The team’s first act will be to get the first batch of stories working in the simplest way possible. The team will only add the infrastructure when a story comes along that forces them to do so.

“You aren’t going to need it. An [agile] team seriously considers what will happen if they resist the temptation to add infrastructure before it is strictly needed. They start from the assumption that they aren’t going to need that infrastructure.

“Test-Driven Development

“All production code is written in order to make failing tests pass. First we write a test that fails because the functionality for which it is testing doesn’t exist. Then we write the code that makes that test pass.

“Once a test passes, it is added to the body of passing tests and is never allowed to fail again. This growing body of tests is run several times per day, every time the system is built. If a test fails, the build is declared a failure. Thus, once a requirement is implemented, it is never broken. The system migrates from one working state to another and is never allowed to be inoperative for longer than a few hours.”
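The Planning Game rules in the excerpt above boil down to a small piece of arithmetic: the budget is whatever the developers finished last iteration, and the customers pick stories that fit inside it. The sketch below assumes the customers simply walk the backlog in priority order and take every story that still fits. That selection rule, the function name, and the story names and costs are all assumptions for illustration; the book leaves the choice to the customers.

```python
def plan_iteration(backlog, budget):
    """Pick stories, in priority order, whose costs do not exceed the budget.

    backlog: list of (story_name, estimated_cost) pairs, most important first.
    budget: how much the developers got done in the last iteration.
    """
    chosen = []
    remaining = budget
    for name, cost in backlog:
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen, budget - remaining


# Hypothetical backlog; the developers completed 10 points last week.
backlog = [("signup", 3), ("checkout", 5), ("search", 4), ("profile page", 2)]
stories, total_cost = plan_iteration(backlog, budget=10)
print(stories, total_cost)
```

Here “search” is skipped because it no longer fits once “signup” and “checkout” are chosen, so it waits for the next iteration or gets split into smaller stories.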

And here’s a pdf of Chapter 1 that I got from Uncle Bob’s website.

By themselves, these practices actually have too many flaws to be effective. Agile Software Development includes the nine remaining practices that you need to make an agile process work.

Parts of this book are for anyone, and parts are for developers. If you’re a founder or a product manager, read everything up to and including Chapter 3. If you’re a developer (I’m not), I’m guessing you should read everything up to and including Chapter 7—or read the whole book.

So what does any of this have to do with eliminating waste? More on that later.

“As an investor and board member, it’s comforting for me to see a team using lean development. It gives me transparency on product development and engineering. I even see it reflected in the way the company manages its business objectives and goals.”

Scott Raney, Redpoint Ventures

Summary: “Lean” is the most capital-efficient way to run a business. Lean is the never-ending process of eliminating waste: finding every activity that does not create value for the customer and eliminating it. The two greatest wastes are overproduction (making things the customer doesn’t want) and inventory (making things that aren’t used immediately).

Every entrepreneur must learn how to run a lean startup (some people say agile instead of lean — same thing). It’s the most capital-efficient way to run a business. It’s how you get to product/market fit. It’s how you do more with less money.

If you’re not lean, getting lean is probably the most effective thing you can do for your business. Smart investors and boards will soon be demanding lean. And smart startups will get lean while the other ones will get left behind.

Q. What’s a lean startup?

Lean startups eliminate waste: they eliminate every activity that is not necessary for creating customer value. If you eliminate enough waste, you can be fast, cheap, high quality, and effective—because more and more of your activities will be creating value for the customer.

In Toyota Production System, Taiichi Ohno (the father of lean) says,

“All we are doing is looking at the timeline… from the moment the customer gives us an order to the point that we collect the cash. And we are reducing that time line by removing the non-value-added wastes…

“True efficiency improvement comes when we produce zero waste and bring the percentage of work to 100 percent:

Present capacity = work + waste.”

Toyota created lean and used it to grow from a small company to the world’s largest automaker. They simply find every activity that doesn’t create value for the customer and eliminate it.

Q. What’s waste?

The two greatest wastes are:

  1. Overproduction: Things the customer doesn’t want. Cars the customer won’t buy. Features the customer doesn’t want. Software the customer won’t purchase.
  2. Inventory: Parts that aren’t used immediately. Mufflers that aren’t in cars. Features that can’t ship because they’re buggy. Code that isn’t in customer hands. Architectures that aren’t coded. Requirements that aren’t coded, shipped, and useful. Most to-do lists.

In Extreme Programming Explained, Kent Beck (the father of Extreme Programming) says:

“Taiichi Ohno, the spiritual leader of [Toyota Production System], says the greatest waste is the waste of overproduction. If you make something and can’t sell it, the effort that went into making it is lost. If you make something internally in the line and don’t use it immediately, its information value evaporates. There are also storage costs: you have to haul it to a warehouse; track it while it is there; polish the rust off it when you take it back out again; and risk that you’ll never use it at all, in which case you have to pay to haul it away.

“Software development is full of the waste of overproduction: fat requirements documents that rapidly grow obsolete; elaborate architectures that are never used; code that goes months without being integrated, tested, and executed in a production environment; and documentation no one reads until it is irrelevant or misleading. While all of these activities are important to software development, we need to use their output immediately in order to get the feedback we need to eliminate waste.

“While individual machines may work more smoothly with lots of… inventory, the factory… as a whole doesn’t work as well. If you use a part immediately you get the value of the part itself as well as information about whether the upstream machine is working correctly… Parts aren’t just parts but also information…

“Requirements gathering, for instance, will not improve by having ever more elaborate requirements-gathering processes but by shortening the path between the production of requirements… and the deployment of the software specified… Requirements gathering isn’t a phase that produces a static document; but an activity producing detail, just before it is needed, throughout development.”

Q. How do you get lean?

Lean probably seems pretty abstract so far. Next up are a few posts describing specific ways to be lean and eliminate waste.

Thanks: To Fred Wilson for inspiring the title of this article with his post, Capital Efficiency Finds Its Moment.

Go read Eric Ries’ new blog: Startup Lessons Learned. He’s a Venture Advisor at KPCB and a co-founder, CTO, and VPE of IMVU.

Eric blogs about one of my favorite topics: applying lean/agile to startups. Lean thinking is the number one thing you can do to make your startup more effective. His post on A new version of the Joel Test is a great place to start…

On board meetings:

“At IMVU, we opened up our board meetings to the whole company, and invited all of our advisers to boot. Sometimes it put some serious heat on the management team, but it was well worth it because everyone walked out of that room feeling at a visceral level the challenges the company faced.”

On solitary programmers:

“It’s not true that energized programmers primarily do solitary work; certainly that’s not true of the great agile teams I’ve known. Instead, teams should have their own space, under their control, with the tools they need to do the job.”

On schedules:

“Agile team-building practices make scheduling per se much less important. In many startup situations, ask yourself ‘Do I really need to accurately know when this project will be done?’ When the answer is no, we can cancel all the effort that goes into building schedules and focus on making progress evident. Everyone will be able to see how much of the product is done vs undone, and see the finish line either coming closer or receding into the distance. When it’s receding, we rescope.”

On QA:

“Imagine a world where your QA team never, ever worries about bug regressions. They just don’t happen. All of their time is dedicated to finding novel reproduction paths for tricky issues. That’s possible now, and it means that the historical ratio of QA to engineering is going to have to change (on the other hand, QA is now a lot more interesting of a job).”

SEM on five dollars a day is another great post among many. Thanks to Andrew Chen for bringing this blog to my attention.

The Laws of Productivity is a must-read presentation for startups that want to be more productive. Here’s a direct link to the pdf. And here’s my summary of the presentation:

Individuals

  1. Work 40 hours a week. (Working more feels like you’re doing more, but you’re actually doing less.)
  2. Work below capacity (say 80%) during those 40 hours.
  3. Consider spreading 40 hours across 4 days instead of 5.
  4. Get the sleep you need; allocate 8 hours.
  5. If you need a short productivity boost, work more for 3 weeks. But expect an equivalent reduction in productivity afterwards.

Teams

  1. Work in small cross-functional teams (< 10 people).
  2. Put team members in a dedicated and closed room.
  3. Try not to split people’s time across multiple teams at once.

Thanks to Dan Cook for creating the presentation and Andrew Chen for bringing it to my attention.